One Big Table: The Denormalized Approach to Data Modeling
Learn how One Big Table architecture simplifies data pipelines by combining all attributes into single wide denormalized tables.
Every data modeling methodology has its advocates. Kimball dimensional modeling gives you stars. Data Vault gives you hubs and satellites. One Big Table gives you one table to rule them all.
OBT is exactly what it sounds like. Instead of splitting data into normalized fact and dimension tables, you combine everything into single wide tables, typically one per source entity. A customer table contains every attribute about a customer, denormalized from all source systems. An order table contains order data joined with customer attributes, product attributes, and store attributes.
This approach fell out of favor in traditional data warehousing circles, but it has made a comeback in modern data platforms, particularly those built on columnar storage and cloud data warehouses.
The Core Philosophy
The rationale behind OBT is pragmatic. Complexity has a cost. Every join requires understanding relationships, maintaining foreign keys, and debugging null values. Every schema change ripples across multiple tables.
OBT trades storage space and some data redundancy for query simplicity. When you want to analyze orders, you query the order table. All the context you need is already there. No joins required.
-- OBT style: one query, no joins
SELECT
    order_date,
    customer_name,
    customer_region,
    product_name,
    product_category,
    store_city,
    quantity,
    unit_price,
    total_amount
FROM rpt_orders
WHERE order_date >= '2026-01-01';
Compare to the equivalent star schema query joining fact_sales to dim_customer, dim_product, dim_store, and dim_date. The OBT query is simpler to write and understand.
Why OBT Makes Sense in Modern Data Warehouses
Cloud data warehouses like BigQuery, Snowflake, and Redshift are built for columnar storage and massively parallel processing. In these environments, denormalized tables are not the performance problem they once were.
Columnar storage compresses repeated values efficiently. If every row in your order table has the same customer_region values repeated, those compress well. Modern query engines can scan billions of rows quickly when the data is stored in a single location.
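The compression benefit is easy to see with a small experiment. The sketch below uses Python's zlib as a rough stand-in for warehouse compression (real columnar engines use run-length and dictionary encoding, which exploit repetition even more aggressively); the column contents are illustrative:

```python
import zlib

# A denormalized column: the same few region values repeated across many rows,
# roughly what a columnar engine sees when customer_region is copied onto every order.
repeated = ("Northeast," * 250_000 + "West," * 250_000).encode()

# A high-cardinality column of the same raw size: unique order numbers.
unique = ",".join(f"ORD-{i:07d}" for i in range(500_000)).encode()[:len(repeated)]

ratio_repeated = len(zlib.compress(repeated)) / len(repeated)
ratio_unique = len(zlib.compress(unique)) / len(unique)

print(f"repeated column compresses to {ratio_repeated:.2%} of original size")
print(f"unique column compresses to {ratio_unique:.2%} of original size")
```

The repeated column shrinks to a small fraction of a percent, while the unique column compresses far less, which is why the storage cost of denormalized repetition is smaller than row-oriented intuition suggests.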
The operational simplicity is significant. One table to manage, one schema to understand, one location to query. When a new analyst joins the team, they can understand the data model without a diagram of twenty related tables.
Structure of an OBT
An OBT report table typically combines elements that would be spread across fact and dimension tables in a star schema.
CREATE TABLE rpt_orders (
    -- Order identifiers
    order_id BIGINT PRIMARY KEY,
    order_number VARCHAR(20),
    order_date DATE,
    order_datetime TIMESTAMP,
    -- Customer attributes (denormalized)
    customer_id INT,
    customer_name VARCHAR(200),
    customer_email VARCHAR(200),
    customer_region VARCHAR(100),
    customer_segment VARCHAR(50),
    customer_tier VARCHAR(20),
    -- Product attributes (denormalized)
    product_id INT,
    product_name VARCHAR(200),
    product_category VARCHAR(100),
    product_subcategory VARCHAR(100),
    product_brand VARCHAR(100),
    -- Store attributes (denormalized)
    store_id INT,
    store_name VARCHAR(200),
    store_city VARCHAR(100),
    store_region VARCHAR(100),
    store_format VARCHAR(50),
    -- Order metrics
    quantity_ordered INT,
    unit_price DECIMAL(10,2),
    discount_percent DECIMAL(5,2),
    discount_amount DECIMAL(10,2),
    total_amount DECIMAL(12,2),
    cost_of_goods DECIMAL(12,2),
    -- Order status
    order_status VARCHAR(50),
    is_returned BOOLEAN,
    -- Processing dates
    shipped_date DATE,
    delivered_date DATE,
    days_to_ship INT,
    days_to_deliver INT,
    -- Audit columns
    load_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    source_system VARCHAR(50)
);
Every attribute you need for order analysis is in one table. The analyst does not need to know which attributes come from which source system or how to join them.
Loading OBT Tables
The load process for OBT tables requires joining source data at load time.
def load_rpt_orders(source_orders, source_customers, source_products, source_stores):
    """Denormalize order data from multiple sources."""
    query = """
        SELECT
            o.order_id,
            o.order_number,
            o.order_date,
            o.quantity,
            o.unit_price,
            o.discount,
            o.total_amount,
            c.customer_id,
            c.customer_name,
            c.customer_region,
            p.product_id,
            p.product_name,
            p.category,
            s.store_id,
            s.store_city
        FROM staging_orders o
        LEFT JOIN staging_customers c ON o.customer_id = c.customer_id
        LEFT JOIN staging_products p ON o.product_id = p.product_id
        LEFT JOIN staging_stores s ON o.store_id = s.store_id
    """
    # Execute query and load into OBT
    result = execute_query(query)
    insert_into_rpt_orders(result)
The denormalization happens in the ETL layer. The reporting layer sees only flat, joined data.
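The end-to-end mechanics can be shown in a self-contained sketch. Here sqlite3 stands in for the warehouse, and the table and column names follow the examples above but are trimmed down; `execute_query` and `insert_into_rpt_orders` in the snippet above are assumed helpers, replaced here by direct calls:

```python
import sqlite3

# Minimal end-to-end sketch of a denormalizing load, using an in-memory
# SQLite database in place of a real warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_orders (order_id INTEGER, customer_id INTEGER,
                                 product_id INTEGER, total_amount REAL);
    CREATE TABLE staging_customers (customer_id INTEGER, customer_name TEXT,
                                    customer_region TEXT);
    CREATE TABLE staging_products (product_id INTEGER, product_name TEXT);
    CREATE TABLE rpt_orders (order_id INTEGER, customer_name TEXT,
                             customer_region TEXT, product_name TEXT,
                             total_amount REAL);

    INSERT INTO staging_orders VALUES (1, 10, 100, 25.0), (2, 10, 101, 40.0);
    INSERT INTO staging_customers VALUES (10, 'Acme Corp', 'Northeast');
    INSERT INTO staging_products VALUES (100, 'Widget'), (101, 'Gadget');
""")

# The denormalization happens at load time: one INSERT ... SELECT with LEFT JOINs.
conn.execute("""
    INSERT INTO rpt_orders
    SELECT o.order_id, c.customer_name, c.customer_region,
           p.product_name, o.total_amount
    FROM staging_orders o
    LEFT JOIN staging_customers c ON o.customer_id = c.customer_id
    LEFT JOIN staging_products p ON o.product_id = p.product_id
""")

# The reporting layer now queries flat rows with no joins; note that the
# customer's region is repeated on each of their orders.
rows = conn.execute(
    "SELECT customer_region, total_amount FROM rpt_orders ORDER BY order_id"
).fetchall()
print(rows)
```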
Handling Changes with OBT
One trade-off of OBT is handling slowly changing dimensions. In a star schema, SCD strategies are explicit. In OBT, you need a different approach.
Approach 1: Latest Value Only
For attributes where historical accuracy is not critical, you overwrite with the latest values.
UPDATE rpt_orders
SET customer_tier = 'Gold',
    customer_region = 'Northeast'
WHERE customer_id = 12345;
This works when you only care about current state and do not need to analyze historical values.
Approach 2: Effective Date Columns
For attributes that need historical tracking, include effective date columns in the table.
ALTER TABLE rpt_orders ADD COLUMN customer_tier_effective_date DATE;
ALTER TABLE rpt_orders ADD COLUMN customer_tier_expiry_date DATE;
The reporting query filters to the appropriate effective date range.
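The mechanics of that filter are easiest to see in isolation. The sketch below demonstrates the effective/expiry date range lookup on a simplified standalone table (the table and helper names are illustrative; in the OBT itself the same BETWEEN predicate applies to the columns added above). A far-future expiry date like '9999-12-31' conventionally marks the current row:

```python
import sqlite3

# A customer was Silver until 2025-06-30, then Gold. Point-in-time queries
# pick the row whose effective/expiry range covers the date of interest.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_tier_history (
        customer_id INTEGER,
        customer_tier TEXT,
        effective_date TEXT,
        expiry_date TEXT   -- '9999-12-31' marks the current row
    );
    INSERT INTO customer_tier_history VALUES
        (12345, 'Silver', '2024-01-01', '2025-06-30'),
        (12345, 'Gold',   '2025-07-01', '9999-12-31');
""")

def tier_on(customer_id, as_of_date):
    """Return the tier that was in effect for a customer on a given date."""
    row = conn.execute("""
        SELECT customer_tier FROM customer_tier_history
        WHERE customer_id = ?
          AND ? BETWEEN effective_date AND expiry_date
    """, (customer_id, as_of_date)).fetchone()
    return row[0] if row else None

print(tier_on(12345, '2025-03-15'))  # Silver: the tier at the time
print(tier_on(12345, '2026-01-10'))  # Gold: the current tier
```

ISO-formatted date strings compare correctly as text, which is what makes the BETWEEN predicate work here.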
Approach 3: Separate History Table
Keep a full history in a separate table and load the latest state into the OBT.
-- History table: complete audit trail
CREATE TABLE hist_customer_changes (
    customer_id INT,
    attribute_name VARCHAR(50),
    old_value VARCHAR(200),
    new_value VARCHAR(200),
    change_date TIMESTAMP
);

-- OBT: latest values only
CREATE TABLE rpt_customer (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(200),
    customer_tier VARCHAR(20),
    last_updated TIMESTAMP
);
This separates the concerns of historical analysis and current state reporting.
Advantages of One Big Table
The appeal of OBT is straightforward:
Query simplicity. Analysts write single-table queries without joins. This reduces errors and speeds up ad-hoc analysis.
Faster development. New reports do not require understanding a complex schema of related tables. If the data is in the table, you query it.
Easier onboarding. New team members can be productive faster when there is only one table to understand.
Debugging is simpler. When data looks wrong, you check one table instead of tracing through multiple joins.
Cloud-native performance. Columnar cloud databases handle wide denormalized tables efficiently. Compression makes storage reasonable, and MPP execution makes queries fast.
Disadvantages and Trade-offs
OBT is not without its critics, and their objections point to real costs:
Data redundancy. The same customer attributes appear on every order that customer places. Storage costs increase compared to normalized designs.
Update anomalies. When a customer moves to a new region, you must update every row for that customer. Miss one and you have inconsistent data.
Unclear ownership. In a star schema, dimension tables have clear owners. In OBT, who owns the customer attributes on the order table?
Loss of relationship information. OBT hides the relationships between entities. You cannot easily see what other orders a customer has placed without querying the orders table.
Difficulty in change management. Schema changes to a wide table are more impactful. Adding a column affects the entire table.
OBT in the Modern Data Stack
OBT has found renewed popularity with the rise of dbt and modern transformation layers.
In a dbt project, you might model your data as OBT:
-- models/reporting/rpt_orders.sql
{{ config(materialized='table') }}

SELECT
    o.order_id,
    o.order_date,
    c.customer_name,
    c.customer_region,
    p.product_name,
    p.product_category,
    s.store_city,
    o.quantity,
    o.total_amount
FROM {{ ref('stg_orders') }} o
LEFT JOIN {{ ref('stg_customers') }} c ON o.customer_id = c.customer_id
LEFT JOIN {{ ref('stg_products') }} p ON o.product_id = p.product_id
LEFT JOIN {{ ref('stg_stores') }} s ON o.store_id = s.store_id
The staging layer maintains normalized data. The reporting layer denormalizes into OBT tables. You get the best of both worlds: clean source data and simple reporting queries.
This layered approach addresses the update anomaly problem. The staging layer is the system of record. The OBT layer is a derived reporting view.
flowchart TB
subgraph "Star Schema"
F[fact_sales<br/>sale_id, product_id, customer_id]
D1[dim_product<br/>product_id, name, category]
D2[dim_customer<br/>customer_id, name, region]
D3[dim_date<br/>date_id, year, month]
F --> D1
F --> D2
F --> D3
end
subgraph "One Big Table"
OBT[rpt_orders<br/>order_id, customer_name, customer_region, product_name, product_category, store_city, quantity, total_amount]
end
When OBT Works Well
OBT is a good fit when:
Query patterns are predictable. If users mostly query order data with customer and product context, OBT serves those queries well.
Analytics latency is acceptable. Denormalized tables are refreshed on a schedule. Real-time updates are more complex.
Storage costs are not prohibitive. Cloud storage is cheap, but denormalization multiplies storage requirements.
Team has strong data governance. Without the structural constraints of a star schema, data quality depends more on governance processes.
For organizations that prioritize query simplicity and fast development cycles over storage efficiency, OBT is a reasonable choice.
One Big Table Production Failure Scenarios
Update anomaly causing inconsistent customer region
A customer moves from “West” to “Northeast.” The ETL job runs an UPDATE on the rpt_orders table but only processes 99% of the customer’s rows—the job fails mid-way through. The customer now has “West” on older orders and “Northeast” on newer ones. Revenue reports by region are inconsistent until the next full refresh.
Mitigation: Use staging tables as the system of record. OBT is always derived from staging via a complete refresh or a MERGE operation, not incremental updates. If incremental updates are unavoidable, wrap the entire update in a transaction and monitor affected row counts.
SCD handling producing wrong historical totals
The team decides to track customer tier changes in rpt_orders using latest-value-only (Type 1). An analyst queries historical revenue by customer tier for the past year. The result shows all historical orders attributed to the customer’s current tier, not the tier they had at the time of purchase. Marketing sees inflated Gold customer revenue that never actually existed.
Mitigation: Be explicit about SCD limitations in documentation. For historical analysis, use effective date columns (Type 2 equivalent) or a separate history table. Do not let business users assume OBT supports historical SCD when it uses Type 1.
Schema change requiring full table rewrite
A new requirement adds 15 new columns to rpt_orders. The ALTER TABLE runs for 4 hours on a 500GB table, during which the table is locked or significantly degraded. BI tools time out. Dashboard users see errors.
Mitigation: Add columns as a separate operation from data population. Add the column, backfill data in batches during low-traffic windows, then make the column visible to users. For very large tables, use a shadow-table approach: create the new table, switch reads to it atomically, then drop the old table.
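The shadow-table approach can be sketched concretely. Here sqlite3 stands in for the warehouse and the names are illustrative; a real warehouse would use its own atomic rename or swap facility, and the backfill would run in batches rather than via a column default:

```python
import sqlite3

# Shadow-table schema change: build the new, wider table alongside the old
# one, copy the data, then swap the names so readers always see a complete table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rpt_orders (order_id INTEGER, total_amount REAL);
    INSERT INTO rpt_orders VALUES (1, 25.0), (2, 40.0);
""")

conn.executescript("""
    -- Shadow table with the new schema; the new column gets a placeholder
    -- default and would be backfilled in batches afterwards.
    CREATE TABLE rpt_orders_new (
        order_id INTEGER,
        total_amount REAL,
        customer_tier TEXT DEFAULT 'Unknown'
    );
    INSERT INTO rpt_orders_new (order_id, total_amount)
        SELECT order_id, total_amount FROM rpt_orders;

    -- The swap: queries against rpt_orders never see a half-built table.
    ALTER TABLE rpt_orders RENAME TO rpt_orders_old;
    ALTER TABLE rpt_orders_new RENAME TO rpt_orders;
    DROP TABLE rpt_orders_old;
""")

rows = conn.execute(
    "SELECT order_id, total_amount, customer_tier FROM rpt_orders ORDER BY order_id"
).fetchall()
print(rows)
```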
Load job duplication inflating order counts
The ETL load is not idempotent. A rerun of yesterday’s load inserts duplicate rows into rpt_orders instead of replacing them. Every subsequent aggregation is double-counted. The issue goes undetected for three days until an analyst notices revenue is exactly 2x expected.
Mitigation: Always design OBT loads to be idempotent. Use a truncate-and-reload pattern for full refresh, or use a merge (UPSERT) with proper key handling. Add row count validation checks after every load: compare count and sum(amount) against source systems.
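An idempotent merge-style load with post-load validation might look like the sketch below; sqlite3 stands in for the warehouse (its ON CONFLICT upsert plays the role of MERGE, and the `WHERE true` clause is a SQLite parser requirement for upserts fed by a SELECT). Names are illustrative:

```python
import sqlite3

# Idempotent OBT load: an upsert keyed on order_id means rerunning
# yesterday's load replaces rows instead of duplicating them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_orders (order_id INTEGER PRIMARY KEY, total_amount REAL);
    CREATE TABLE rpt_orders (order_id INTEGER PRIMARY KEY, total_amount REAL);
    INSERT INTO staging_orders VALUES (1, 25.0), (2, 40.0);
""")

def load_rpt_orders():
    conn.execute("""
        INSERT INTO rpt_orders (order_id, total_amount)
        SELECT order_id, total_amount FROM staging_orders
        WHERE true
        ON CONFLICT(order_id) DO UPDATE SET total_amount = excluded.total_amount
    """)
    # Post-load validation: row count and amount totals must match the source.
    src = conn.execute("SELECT COUNT(*), SUM(total_amount) FROM staging_orders").fetchone()
    tgt = conn.execute("SELECT COUNT(*), SUM(total_amount) FROM rpt_orders").fetchone()
    assert src == tgt, f"load validation failed: source={src} target={tgt}"

load_rpt_orders()
load_rpt_orders()  # a rerun produces no duplicates and validation still passes

print(conn.execute("SELECT COUNT(*) FROM rpt_orders").fetchone()[0])
```

Had the load been a plain INSERT, the second call would have doubled the row count, which is exactly the 2x-revenue failure described above.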
One Big Table Quick Recap
- OBT combines fact and dimension data into single wide tables for query simplicity—no joins needed.
- Works well on columnar cloud warehouses (BigQuery, Snowflake, Redshift) where denormalization is efficient.
- Trade-offs: update anomalies require careful handling, storage redundancy is unavoidable, schema changes are expensive.
- SCD handling is limited: latest-value-only destroys history; add effective date columns for point-in-time correctness.
- Layer OBT on top of normalized staging tables—never use OBT as the system of record for source data.
- Validate row counts and sums after every load; design all loads to be idempotent.
Compare this to Kimball Dimensional Modeling for structured star schemas or Data Vault for enterprise-scale auditability.