GraphQL vs REST: Choosing the Right API Paradigm

Compare GraphQL and REST APIs, understand when to use each approach, schema design, queries, mutations, and trade-offs between the two paradigms.

Reading time: 36 min read · Author: GeekWorkBench

Introduction

GraphQL came out of Facebook in 2012 and reached open source in 2015. The pitch: instead of multiple endpoints returning fixed data structures, you get one endpoint where the client asks for exactly what it needs.

That sounds simple, but it changes how you think about APIs. REST is resource-oriented — you fetch what the server decides. GraphQL is query-oriented — the client decides. Each model has strengths and trade-offs.

graph LR
    A[Client] -->|"POST /graphql"| B[GraphQL Server]
    B --> C[Schema]
    C --> D[Resolvers]
    D --> E[Data Sources]

The client specifies exactly what data it needs in the query. The server returns exactly that data, nothing more.


Core Concepts

REST: Multiple Endpoints

REST uses different endpoints for different resources:

# REST: Multiple endpoints
GET /users/123
GET /users/123/posts
GET /users/123/followers

Each endpoint returns a fixed data structure. If you need a user’s posts with follower counts, you might need multiple requests or get more data than necessary.

GraphQL: Single Endpoint

GraphQL uses a single endpoint with flexible queries:

# GraphQL: Single endpoint, flexible query
POST /graphql

query {
  user(id: 123) {
    name
    posts {
      title
    }
    followersCount
  }
}

One request gets exactly the data you need.
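To make the idea of "the client chooses the shape" concrete, here is a minimal sketch of field selection over an in-memory object. The `userRecord` data, the `project` helper, and the selection format (nested objects with `true` for scalars) are all illustrative assumptions, not how a GraphQL server is implemented.

```javascript
// Hypothetical record standing in for what a fixed REST endpoint might return.
const userRecord = {
  id: 123,
  name: "Ada",
  email: "ada@example.com",
  bio: "Long biography text",
  posts: [
    { id: 1, title: "First post", body: "..." },
    { id: 2, title: "Second post", body: "..." },
  ],
  followersCount: 42,
};

// A selection is a nested object: `true` selects a scalar field,
// a nested object selects sub-fields of an object or of each list item.
function project(value, selection) {
  if (Array.isArray(value)) {
    return value.map((item) => project(item, selection));
  }
  const result = {};
  for (const [field, sub] of Object.entries(selection)) {
    result[field] = sub === true ? value[field] : project(value[field], sub);
  }
  return result;
}

// Rough equivalent of: { name posts { title } followersCount }
const selected = project(userRecord, {
  name: true,
  posts: { title: true },
  followersCount: true,
});

console.log(JSON.stringify(selected));
```

Fields that were not selected (`email`, `bio`, post bodies) never reach the client, which is the overfetching saving the query above is after.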


Query Patterns

Queries

REST Queries

# REST: Get user and their posts
GET /users/123
GET /users/123/posts

GraphQL Queries

# GraphQL: One request, precise data
query GetUserWithPosts($userId: ID!) {
  user(id: $userId) {
    name
    email
    posts {
      title
      createdAt
    }
  }
}

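Sibling fields in a query can be resolved concurrently: the GraphQL spec allows it, and servers such as graphql-js start each field's resolver without waiting for the others to finish. Top-level mutation fields are the exception and run one after another.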


Mutation Patterns

Mutations

REST Mutations

# REST: Create, update, delete with different HTTP methods
POST /users
PUT /users/123
DELETE /users/123

GraphQL Mutations

# GraphQL: Mutations are explicit
mutation CreateUser($input: CreateUserInput!) {
  createUser(input: $input) {
    id
    name
    email
  }
}

mutation UpdateUser($id: ID!, $input: UpdateUserInput!) {
  updateUser(id: $id, input: $input) {
    id
    name
    email
  }
}

mutation DeleteUser($id: ID!) {
  deleteUser(id: $id)
}

Subscription Architecture

GraphQL Subscriptions Deep Dive

GraphQL has native Subscriptions for real-time updates, unlike REST which relies on polling or webhooks.

Subscription Protocol

Subscriptions usually run over WebSockets, although the GraphQL spec does not mandate a transport. The client subscribes once, and the server pushes updates:

sequenceDiagram
    Client->>Server: subscription operation (WebSocket)
    Server-->>Client: Confirm subscription
    Server->>Database: Watch for changes
    Database-->>Server: New data event
    Server-->>Client: Push: { data: { postCreated: {...} } }
    Server->>Database: Continue watching
    Database-->>Server: Another event
    Server-->>Client: Push: { data: { postCreated: {...} } }

Schema Definition

type Subscription {
  postCreated: Post!
  postUpdated(id: ID!): Post!
  userJoined(roomId: ID!): User!
}

type Mutation {
  createPost(input: CreatePostInput!): Post!
  updatePost(id: ID!, input: UpdatePostInput!): Post!
  joinRoom(roomId: ID!): Room!
}

type Query {
  posts: [Post!]!
}

Subscription Resolver Implementation

import { PubSub } from 'graphql-subscriptions';

const pubsub = new PubSub();

// Define event names as constants
const POST_CREATED = 'POST_CREATED';
const POST_UPDATED = 'POST_UPDATED';

// Resolvers
const resolvers = {
  Subscription: {
    postCreated: {
      subscribe: () => pubsub.asyncIterator([POST_CREATED]),
    },
    postUpdated: {
      subscribe: (_, { id }) => pubsub.asyncIterator(`${POST_UPDATED}_${id}`),
    },
  },
  Mutation: {
    createPost: async (_, { input }, { pubsub }) => {
      const post = await db.posts.create(input);

      // Publish the event to all subscribers
      pubsub.publish(POST_CREATED, { postCreated: post });

      return post;
    },
    updatePost: async (_, { id, input }, { pubsub }) => {
      const post = await db.posts.update(id, input);

      // Publish to specific post subscribers
      pubsub.publish(`${POST_UPDATED}_${id}`, { postUpdated: post });

      return post;
    },
  },
};
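The publish/subscribe flow behind these resolvers can be sketched without any library. The `SimplePubSub` class below is a hypothetical, callback-based stand-in (the real `graphql-subscriptions` package exposes async iterators instead); it only illustrates how per-post topic names route events to the right subscribers.

```javascript
// Minimal in-process pub/sub: topic name -> set of handler functions.
class SimplePubSub {
  constructor() {
    this.handlers = new Map();
  }

  // Register a handler and return a function that removes it again.
  subscribe(topic, handler) {
    if (!this.handlers.has(topic)) {
      this.handlers.set(topic, new Set());
    }
    this.handlers.get(topic).add(handler);
    return () => this.handlers.get(topic).delete(handler);
  }

  // Deliver the payload to every handler currently subscribed to the topic.
  publish(topic, payload) {
    for (const handler of this.handlers.get(topic) ?? []) {
      handler(payload);
    }
  }
}

const bus = new SimplePubSub();
const received = [];

// Per-post topic, mirroring the `${POST_UPDATED}_${id}` naming used above.
const unsubscribe = bus.subscribe("POST_UPDATED_7", (payload) => {
  received.push(payload.postUpdated.title);
});

bus.publish("POST_UPDATED_7", { postUpdated: { id: 7, title: "Draft" } });
bus.publish("POST_UPDATED_8", { postUpdated: { id: 8, title: "Other" } }); // no subscriber
unsubscribe();
bus.publish("POST_UPDATED_7", { postUpdated: { id: 7, title: "Final" } }); // ignored after unsubscribe

console.log(received);
```

Returning an unsubscribe function from `subscribe` keeps cleanup next to registration, which matters once connections start dropping.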

Subscription Production Patterns

Filtering Subscriptions

Not all clients should receive all updates. Filter by authorization, room membership, or other criteria:

type Subscription {
  # Only gets events the user is authorized to see
  documentUpdated(documentId: ID!): Document!
}

const resolvers = {
  Subscription: {
    documentUpdated: {
      subscribe: async function* (_, { documentId }, context) {
        // Check authorization
        if (!await context.user.canView(documentId)) {
          throw new Error('Not authorized');
        }

        // Create async generator that yields when matching events
        const eventEmitter = context.documentEvents.filter(
          event => event.documentId === documentId
        );

        for await (const event of eventEmitter) {
          yield { documentUpdated: event.document };
        }
      },
    },
  },
};

Production Subscription Architecture

graph TB
    Client1[WebSocket Client] --> LB[Load Balancer]
    Client2[WebSocket Client] --> LB
    Client3[WebSocket Client] --> LB
    LB --> Server1[GraphQL Server 1]
    LB --> Server2[GraphQL Server 2]
    Server1 --> RedisPubSub[Redis Pub/Sub]
    Server2 --> RedisPubSub
    RedisPubSub --> MessageBroker[Redis / RabbitMQ]
    MessageBroker --> Server1
    MessageBroker --> Server2

For multi-server deployments, use Redis Pub/Sub or a message broker:

import { RedisPubSub } from "graphql-redis-subscriptions";

const pubsub = new RedisPubSub({
  connection: {
    host: process.env.REDIS_HOST,
    port: 6379,
    retryStrategy: (times) => Math.min(times * 50, 2000),
  },
});

// Use Redis-backed pub/sub for horizontal scaling
const resolvers = {
  Subscription: {
    postCreated: {
      subscribe: () => pubsub.asyncIterator([POST_CREATED]),
    },
  },
};

Subscription Gotchas and Mitigations

| Issue | Problem | Mitigation |
|---|---|---|
| Memory leaks | Subscriptions hold server resources indefinitely | Cap subscriptions per connection and their lifetime; implement client heartbeat |
| Reconnection storms | Clients reconnect in a burst after an outage | Implement exponential backoff; deduplicate on reconnect |
| Authorization drift | User loses access but keeps subscription | Re-validate authorization periodically; emit "kicked" event |
| Query complexity | Subscription queries are as complex as regular queries | Apply same complexity limits; consider simplified subscription payloads |

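// graphql-subscriptions' PubSub has no built-in connection limits or event TTL,
// so enforce them where connections are accepted. The option names below are
// illustrative only and belong to a hypothetical wrapper, not to a real library.
const subscriptionLimits = {
  // Limit subscription count per connection
  maxSubscriptionsPerConnection: 100,
  // Set TTL for subscription events (seconds)
  eventTtlSeconds: 10,
};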
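The "reconnection storms" mitigation can be sketched as a client-side delay calculation. This is one common variant (exponential backoff with full jitter); the function name, the default base and cap, and the injectable `random` parameter are assumptions made for illustration.

```javascript
// Delay before reconnect attempt `attempt` (0-based), in milliseconds.
// `random` is injectable so the jitter can be made deterministic in tests.
function reconnectDelay(attempt, { baseMs = 500, maxMs = 30000, random = Math.random } = {}) {
  // Exponential growth, capped so long outages do not produce huge waits.
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  // "Full jitter": pick a point in [0, ceiling) so clients spread out.
  return Math.floor(random() * ceiling);
}

// With the jitter pinned near its upper end, the growing ceilings become visible.
const ceilings = [0, 1, 2, 3, 10].map((attempt) =>
  reconnectDelay(attempt, { random: () => 0.999999 }),
);
console.log(ceilings);
```

The jitter is what breaks up the burst: without it, every client that lost its connection at the same moment would also retry at the same moment.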

When to Use Subscriptions vs Polling vs Webhooks

| Approach | Latency | Scalability | Use Case |
|---|---|---|---|
| Subscriptions | Instant | Medium | UI updates, live collaboration |
| Polling | Poll interval | High | Infrequent updates, simple clients |
| Webhooks | Near-instant | High | Cross-service communication |
| Server-Sent Events | Near-instant | Medium | One-way server push, simpler than WS |

For most GraphQL use cases: subscriptions for real-time UI, REST webhooks for cross-service events.


Persisted Queries

Persisted Queries & Query Whitelisting

By default, GraphQL accepts any query string sent by clients. This flexibility is powerful but creates security and performance problems. Persisted queries solve both.

The Problem with Dynamic Queries

Every GraphQL request sends the full query string:

# Every request - even identical ones - sends the full query
POST /graphql
{
  "query": "query GetUser($id: ID!) { user(id: $id) { name email posts { title } } }",
  "variables": { "id": "123" }
}

This means:

  • Security: Attackers can send complex or malicious queries
  • Performance: Server parses and validates the same queries repeatedly
  • Bandwidth: Large queries consume unnecessary network overhead

How Persisted Queries Work

Instead of sending the full query, clients send a hash that references a pre-registered query:

# Instead of full query...
POST /graphql
{ "query": "{ user(id: 123) { name email } }" }

# Client sends query ID (SHA-256 hash)
POST /graphql
{ "extensions": { "persistedQuery": { "version": 1, "sha256Hash": "a1b2c3d4e5f6..." } } }

The server looks up the hash and executes the pre-registered query.
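The lookup step can be sketched as a pure function over the request body. The registry contents, the placeholder hash, the `resolvePersistedQuery` name, and the `PERSISTED_QUERY_REQUIRED` code are assumptions for illustration; only the `extensions.persistedQuery.sha256Hash` shape comes from the request format shown above.

```javascript
// Hypothetical registry built ahead of time: hash -> query text.
// The key below is a placeholder, not a real SHA-256 digest.
const registry = new Map([
  ["a1b2c3", "query GetUser($id: ID!) { user(id: $id) { name email } }"],
]);

// Turn an incoming request body into the query text to execute,
// or an error object when the hash is missing or unknown.
function resolvePersistedQuery(body, knownQueries) {
  const hash = body.extensions?.persistedQuery?.sha256Hash;
  if (!hash) {
    return { error: "PERSISTED_QUERY_REQUIRED" };
  }
  const query = knownQueries.get(hash);
  if (!query) {
    return { error: "PERSISTED_QUERY_NOT_FOUND" };
  }
  return { query, variables: body.variables ?? {} };
}

const known = resolvePersistedQuery(
  {
    variables: { id: "123" },
    extensions: { persistedQuery: { version: 1, sha256Hash: "a1b2c3" } },
  },
  registry,
);
const unknown = resolvePersistedQuery(
  { extensions: { persistedQuery: { version: 1, sha256Hash: "ffff" } } },
  registry,
);

console.log(known.query, unknown.error);
```

Because the server only ever executes text it already holds, the client-supplied hash acts as a key, never as code.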

Apollo Server Implementation

import { ApolloServer } from "@apollo/server";

// Registry of operations seen so far: SHA-256 hash -> query text.
// (In-memory here for illustration; production would load a build-time registry.)
const queryRegistry = new Map();

// 1. Define a plugin that records every full query the server resolves
const createPersistedQueryPlugin = (registry) => ({
  async requestDidStart() {
    return {
      async didResolveOperation({ request, source, queryHash }) {
        // Only record requests that sent the full query text
        // (queryHash is the SHA-256 Apollo Server computed for `source`)
        if (!request.extensions?.persistedQuery) {
          registry.set(queryHash, source);
        }
      },
    };
  },
});

// 2. Create server with plugin
const server = new ApolloServer({
  typeDefs,
  resolvers,
  plugins: [createPersistedQueryPlugin(queryRegistry)],

  // Automatic persisted queries (APQ) are on by default and let clients
  // register new hashes at runtime. They do not reject unknown queries;
  // that needs an allowlist check such as the strict-mode plugin further down.
  // To turn APQ off entirely:
  // persistedQueries: false,
});

Building the Query Registry

// Build-time: generate registry from client queries
// (run as part of your CI/CD build)

import fs from "node:fs";
import { createHash } from "node:crypto";
import { globSync } from "glob"; // named export in glob v9+
import { parse, print } from "graphql";

const QUERIES_DIR = "./src/queries";
const REGISTRY_FILE = "./query-registry.json";

// SHA-256 of the printed document. The client must hash exactly the same
// text, so keep printing and normalization identical on both sides.
const hashQuery = (document) =>
  createHash("sha256").update(print(document)).digest("hex");

function buildQueryRegistry() {
  const registry = {};

  // Find all .graphql files
  const queryFiles = globSync(`${QUERIES_DIR}/**/*.graphql`);

  for (const file of queryFiles) {
    const content = fs.readFileSync(file, "utf-8");
    const document = parse(content);

    // One hash per file; list the operation names it defines
    const hash = hashQuery(document);
    const operationNames = document.definitions
      .filter((definition) => definition.kind === "OperationDefinition")
      .map((definition) => definition.name?.value || "anonymous");

    registry[hash] = { operationNames, file, query: print(document) };
  }

  // Write registry to file (upload to CDN/deployment)
  fs.writeFileSync(REGISTRY_FILE, JSON.stringify(registry, null, 2));

  console.log(`Registered ${Object.keys(registry).length} queries`);
}

buildQueryRegistry();

Client-Side Query Integration

Client-Side Integration

// React Apollo Client (Apollo Client 3+)
import {
  ApolloClient,
  ApolloLink,
  HttpLink,
  InMemoryCache,
} from "@apollo/client";
import { createPersistedQueryLink } from "@apollo/client/link/persisted-queries";
import { sha256 } from "crypto-hash";

// Sends the hash first and falls back to the full query if the server asks for it
const persistedLink = createPersistedQueryLink({ sha256 });
const httpLink = new HttpLink({ uri: "/graphql" });

// The persisted-query link must come before the terminating HTTP link.
// Equivalent: persistedLink.concat(httpLink)
const link = ApolloLink.from([persistedLink, httpLink]);

const client = new ApolloClient({
  cache: new InMemoryCache(),
  link,
});

Query Whitelisting (Strict Mode)

For maximum security, reject any query not in your pre-approved registry:

// Server-side: only allow registered queries
import { ApolloServer } from "@apollo/server";
import { GraphQLError } from "graphql";

const server = new ApolloServer({
  // ... other config

  plugins: [
    {
      async requestDidStart() {
        return {
          // CRITICAL: Reject all unregistered queries
          async didResolveOperation({ queryHash }) {
            // queryHash is the SHA-256 the server computed for the query text,
            // so a client cannot claim a registered hash for different text.
            // queryRegistry is assumed to be a Map keyed by that hash.
            if (!queryRegistry.has(queryHash)) {
              throw new GraphQLError(`Unknown query: ${queryHash}`, {
                extensions: { code: "PERSISTED_QUERY_NOT_ALLOWED" },
              });
            }
          },
        };
      },
    },
  ],
});

Benefits Summary

| Benefit | Without Persisted Queries | With Persisted Queries |
|---|---|---|
| Security | Full query injection risk | Only pre-approved queries run |
| Parse overhead | Every request parsed | Parsed once at build time |
| Network | Full query string each request | 64-char hash only |
| CDN caching | Not cacheable | Persisted queries can use GET |
| Rate limiting | Hard to fingerprint | Per-query rate limits possible |

When to Use

| Use Case | Recommendation |
|---|---|
| Public API | Required: prevents abuse |
| Mobile apps | Highly recommended: bandwidth savings |
| Internal tools | Optional: still helps with parsing overhead |
| Development | Skip: full queries for flexibility |

Data Fetching and N+1 Problem

Data Fetching

Overfetching and Underfetching

REST often leads to overfetching (getting more data than needed) or underfetching (needing multiple requests):

# REST: Gets more data than needed
GET /users/123
# Returns: { id, name, email, created_at, updated_at, profile_url, bio, ... }

# REST: Multiple requests for related data
GET /users/123          # User info
GET /users/123/posts    # User's posts
GET /posts/456/comments # Comments for a specific post

GraphQL solves both problems:

# GraphQL: Exact data
query {
  user(id: 123) {
    name # Only what you need
    posts {
      title # Only what you need
    }
  }
}

N+1 Problem

GraphQL can suffer from the N+1 problem: one query fetches a list of users, then each user’s posts field triggers its own database query:

# This could trigger many database queries
query {
  users {
    name
    posts {
      title # Triggers query for each user's posts
    }
  }
}

DataLoader solves this by batching requests.
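The cost difference is easy to see with a counter. The sketch below uses a hypothetical in-memory `db` object that counts calls; the method names and data are invented for illustration and stand in for real database queries.

```javascript
// Hypothetical in-memory "database" that counts how many queries it receives.
const db = {
  queryCount: 0,
  users: [
    { id: 1, name: "Ada" },
    { id: 2, name: "Lin" },
    { id: 3, name: "Sam" },
  ],
  posts: [
    { authorId: 1, title: "A1" },
    { authorId: 1, title: "A2" },
    { authorId: 3, title: "S1" },
  ],
  allUsers() {
    this.queryCount += 1;
    return this.users;
  },
  postsByAuthor(authorId) {
    this.queryCount += 1;
    return this.posts.filter((p) => p.authorId === authorId);
  },
  postsByAuthors(authorIds) {
    this.queryCount += 1;
    return this.posts.filter((p) => authorIds.includes(p.authorId));
  },
};

// Naive resolution: one query for the list, then one per user (N+1).
function naive() {
  db.queryCount = 0;
  const result = db.allUsers().map((u) => ({ name: u.name, posts: db.postsByAuthor(u.id) }));
  return { result, queries: db.queryCount };
}

// Batched resolution: one query for the list, one for all posts.
function batched() {
  db.queryCount = 0;
  const users = db.allUsers();
  const posts = db.postsByAuthors(users.map((u) => u.id));
  const result = users.map((u) => ({
    name: u.name,
    posts: posts.filter((p) => p.authorId === u.id),
  }));
  return { result, queries: db.queryCount };
}

console.log(naive().queries, batched().queries); // 4 2
```

With three users the gap is small, but the naive count grows with the list while the batched count stays at two.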


DataLoader Patterns Deep Dive

The N+1 problem is GraphQL’s most notorious performance pitfall. DataLoader is Facebook’s official solution—a batching and caching library that coalesces multiple requests into fewer database queries.

DataLoader Deep Dive

How DataLoader Works

DataLoader collects every load(key) call made during a single tick of the event loop, then calls your batch function once with all of the collected keys.

import DataLoader from "dataloader";

// Create a batch function that fetches users by IDs
const userLoader = new DataLoader(async (ids) => {
  // This runs once for all pending user lookups
  const users = await db.users.findMany({ where: { id: { in: ids } } });

  // DataLoader expects results in the same order as input IDs
  const userMap = new Map(users.map((u) => [u.id, u]));
  return ids.map((id) => userMap.get(id) || null);
});

// In your resolver
const resolvers = {
  User: {
    posts: (user, args, context) => context.postLoader.load(user.id),
  },
  Post: {
    author: (post, args, context) => context.userLoader.load(post.authorId),
  },
};
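The batching mechanism itself fits in a few lines. `TinyLoader` below is a simplified stand-in written for this article, not DataLoader's actual implementation: it has no cache and schedules the dispatch with `queueMicrotask`, which is an assumption chosen to keep the sketch short.

```javascript
// Simplified stand-in for DataLoader's batching (no caching).
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn;
    this.queue = []; // pending { key, resolve, reject } entries
  }

  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // The first load() in a tick schedules one dispatch for the whole batch.
      if (this.queue.length === 1) {
        queueMicrotask(() => this.dispatch());
      }
    });
  }

  async dispatch() {
    const batch = this.queue;
    this.queue = [];
    try {
      // One call to the batch function; results must match the key order.
      const values = await this.batchFn(batch.map((entry) => entry.key));
      batch.forEach((entry, i) => entry.resolve(values[i]));
    } catch (error) {
      batch.forEach((entry) => entry.reject(error));
    }
  }
}

const batches = [];
const loader = new TinyLoader(async (ids) => {
  batches.push(ids); // record each batch the "database" sees
  return ids.map((id) => ({ id, name: `user-${id}` }));
});

// Three loads issued in the same tick become a single batch.
const loaded = Promise.all([loader.load(1), loader.load(2), loader.load(3)]);
loaded.then((users) => console.log(batches.length, users.map((u) => u.name)));
```

Resolvers stay written as if they fetch one record each; the loader is what turns those individual calls into one query.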

Batching vs Caching

DataLoader provides two distinct benefits:

// Batching: Multiple users requested in same query
// Query: { users { posts { title } } }
// Only ONE batch call to posts for ALL users
const postLoader = new DataLoader(async (userIds) => {
  // Single query: SELECT * FROM posts WHERE authorId IN (...)
  const allPosts = await Post.findMany({ where: { authorId: { in: userIds } } });

  // Group by authorId
  const postsByAuthor = allPosts.reduce((acc, post) => {
    (acc[post.authorId] ||= []).push(post);
    return acc;
  }, {});

  return userIds.map((id) => postsByAuthor[id] || []);
});

// Caching: Same key requested multiple times in one request
// Query: { posts { author { name } } }
// Posts that share an author call userLoader.load(authorId) with the same key;
// the author is fetched once and later calls hit the per-request cache

Common DataLoader Gotchas

1. Cache key mismatches with mixed key types

// Problem: DataLoader compares keys by identity, so "123" (string) and
// 123 (number) are separate cache entries and trigger separate loads
const loader = new DataLoader((keys) => batchLoad(keys));

loader.load("123"); // string key
loader.load(123); // number key: cache miss, the same user is fetched twice

// Better: normalize keys to one consistent type
const normalizedLoader = new DataLoader(
  (keys) => batchLoad(keys.map((k) => String(k))),
  { cacheKeyFn: (k) => String(k) },
);

2. Memoization vs database freshness

// DataLoader's built-in cache is meant to live for a single request.
// For a cache shared across requests, add an explicit TTL layer
// (an in-process Map here; use Redis/Memcached when running several servers).

import DataLoader from "dataloader";

// Custom cache map with TTL
const createPersistentLoader = (batchFn, ttlMs = 60000) => {
  const cache = new Map(); // key -> { value, timestamp }

  return new DataLoader(
    async (keys) => {
      const now = Date.now();

      // Keys with no entry, or with an expired entry, must be fetched
      const missingKeys = keys.filter((key) => {
        const cached = cache.get(key);
        return !cached || now - cached.timestamp >= ttlMs;
      });

      // Batch load uncached keys
      if (missingKeys.length > 0) {
        const fetched = await batchFn(missingKeys);
        fetched.forEach((value, i) => {
          cache.set(missingKeys[i], { value, timestamp: now });
        });
      }

      // Every key now has a fresh entry; return values in input order
      return keys.map((key) => cache.get(key).value);
    },
    // Disable DataLoader's own memoization so the TTL map is the only cache
    { cache: false },
  );
};

3. Handling partial failures in batches

// Some IDs fail, some succeed - handle gracefully
const userLoader = new DataLoader(async (ids) => {
  const users = await User.findMany({ where: { id: { in: ids } } });
  const userMap = new Map(users.map((u) => [u.id, u]));

  return ids.map((id) => {
    const user = userMap.get(id);
    if (!user) {
      // Return Error for missing, not null
      // This preserves the error in GraphQL response
      return new Error(`User ${id} not found`);
    }
    return user;
  });
});

DataLoader with Different Data Sources

// REST API as a data source
const remoteServiceLoader = new DataLoader(async (ids) => {
  const responses = await Promise.all(
    ids.map((id) => fetch(`/api/users/${id}`).then((r) => r.json())),
  );
  return responses;
});

// MongoDB with aggregation pipeline
const mongoLoader = new DataLoader(async (objectIds) => {
  const results = await User.aggregate([
    { $match: { _id: { $in: objectIds } } },
    {
      $lookup: {
        from: "posts",
        localField: "_id",
        foreignField: "authorId",
        as: "posts",
      },
    },
  ]);

  const resultMap = new Map(results.map((r) => [r._id.toString(), r]));
  return objectIds.map((id) => resultMap.get(id.toString()) || null);
});

Error Handling

Error Handling Overview

REST Errors

REST uses HTTP status codes:

HTTP/1.1 404 Not Found
Content-Type: application/json

{"error": "User not found"}

GraphQL Errors

GraphQL servers typically return 200 OK even when a resolver fails. Errors are reported in the response body:

{
  "data": null,
  "errors": [
    {
      "message": "User not found",
      "locations": [{ "line": 3, "column": 5 }],
      "path": ["user"]
    }
  ]
}

This is controversial. Some prefer HTTP status codes for errors.
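Because the status code alone says little, clients usually inspect the body. A minimal sketch of that check follows; the `unwrapGraphQL` name and the returned `failedPaths` field are assumptions for illustration, while the `data`, `errors`, `message`, and `path` keys follow the response shape shown above.

```javascript
// Split a GraphQL response body into usable data and per-field errors.
function unwrapGraphQL(body) {
  const errors = body.errors ?? [];
  // No data at all: the whole operation failed.
  if (body.data == null) {
    const messages = errors.map((e) => e.message).join("; ");
    throw new Error(messages || "GraphQL response had neither data nor errors");
  }
  // Partial success: data is present, and each error names the path it affected.
  return {
    data: body.data,
    failedPaths: errors.map((e) => (e.path ?? []).join(".")),
  };
}

const partial = unwrapGraphQL({
  data: { user: { name: "Ada", posts: null } },
  errors: [{ message: "Posts service unavailable", path: ["user", "posts"] }],
});
console.log(partial.failedPaths); // [ 'user.posts' ]
```

Treating "data plus errors" as a partial success lets a UI render what did load and flag only the failed fields.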


Caching Strategies

Effective caching is critical for performance in both REST and GraphQL, but each requires different strategies.

REST Caching

REST works well with HTTP caching:

GET /users/123
Cache-Control: max-age=3600
ETag: "v1"

CDNs, browser caches, and libraries like React Query handle REST caching well.
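The revalidation step behind `ETag` can be sketched as a small function. The `conditionalGet` helper and the `resource` shape are hypothetical; a real server would read `If-None-Match` from the request headers and derive the tag from the stored resource.

```javascript
// Decide how to answer a GET given the client's If-None-Match header.
// `resource` is a hypothetical record carrying its own version tag.
function conditionalGet(resource, ifNoneMatch) {
  const etag = `"${resource.version}"`;
  const headers = { ETag: etag, "Cache-Control": "max-age=3600" };
  if (ifNoneMatch === etag) {
    // Client copy is current: no body needed.
    return { status: 304, headers, body: null };
  }
  return { status: 200, headers, body: JSON.stringify(resource.data) };
}

const user = { version: "v1", data: { id: 123, name: "Ada" } };
const first = conditionalGet(user, undefined);
const revalidated = conditionalGet(user, first.headers.ETag);
console.log(first.status, revalidated.status); // 200 304
```

The 304 path sends headers only, which is why URL-addressable resources are cheap to keep fresh.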

GraphQL Caching

GraphQL POST requests are harder to cache by default. Solutions:

  • Normalized caching with Apollo Client or Relay
  • Persisted queries that become GET requests
  • Response caching at the CDN level

Client-Side Caching Strategies

Server-side caching for GraphQL is tricky because POST requests with dynamic queries don’t benefit from URL-based caching. Client-side caching becomes essential.

Apollo Client Cache

Apollo Client 3 uses a normalized in-memory cache with automatic cache updates:

import { ApolloClient, InMemoryCache, makeVar } from '@apollo/client';

// Reactive variables for local state
export const cartItemsVar = makeVar<string[]>([]);

// Configure normalized cache
const cache = new InMemoryCache({
  typePolicies: {
    // Customize field-level read/write
    User: {
      fields: {
        // Merge paginated posts into one list
        posts: {
          keyArgs: false, // Ignore pagination args: one cached list per User
          merge(existing = { items: [] }, incoming) {
            // Cursor-based pagination merge
            return {
              ...incoming,
              items: [...existing.items, ...incoming.items],
              cursor: incoming.cursor,
            };
          },
        },
        // Real-time: update cache when subscription fires
        notifications: {
          merge(existing, incoming) {
            return incoming; // Replace on each notification
          },
        },
      },
    },
    Query: {
      fields: {
        // One cache entry per distinct search string
        searchUsers: {
          keyArgs: ['query'],
        },
      },
    },
  },
});

const client = new ApolloClient({ cache, link });

Cache Normalization

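By default, Apollo Client already normalizes every object that has a __typename and an id (or _id) field. Customize the cache ID only when an entity needs a different key:

import { InMemoryCache, defaultDataIdFromObject } from "@apollo/client";

const cache = new InMemoryCache({
  // Every User and Post stored by their ID
  // (dataIdFromObject is deprecated in Apollo Client 3; typePolicies.keyFields is the newer option)
  dataIdFromObject: (object) => {
    switch (object.__typename) {
      case "User":
        return `User:${object.id}`;
      case "Post":
        return `Post:${object.id}`;
      default:
        // Fall back to Apollo's default __typename:id scheme
        return defaultDataIdFromObject(object);
    }
  },
});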

// Now cache handles updates automatically
// If Post with id:123 is updated via mutation,
// all queries showing that post update automatically
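What normalization buys can be shown with a toy store. The `normalize` function below is a rough sketch loosely modelled on the idea of a normalized cache, not Apollo's implementation; the `__ref` pointer format and the plain-object store are assumptions.

```javascript
// Flatten a response into a lookup table keyed by "Typename:id".
// Nested entities are replaced by { __ref } pointers.
function normalize(value, store) {
  if (Array.isArray(value)) {
    return value.map((item) => normalize(item, store));
  }
  if (value === null || typeof value !== "object") {
    return value;
  }
  const fields = {};
  for (const [key, child] of Object.entries(value)) {
    fields[key] = normalize(child, store);
  }
  if (value.__typename && value.id !== undefined) {
    const cacheId = `${value.__typename}:${value.id}`;
    // Merge so that later responses update the existing entry.
    store[cacheId] = { ...store[cacheId], ...fields };
    return { __ref: cacheId };
  }
  return fields;
}

const store = {};
normalize(
  {
    __typename: "User",
    id: 1,
    name: "Ada",
    posts: [{ __typename: "Post", id: 123, title: "Old title" }],
  },
  store,
);
// A mutation result for the same post updates the single shared entry.
normalize({ __typename: "Post", id: 123, title: "New title" }, store);

console.log(store["User:1"].posts[0].__ref, store["Post:123"].title);
```

The user entry only points at `Post:123`, so any view reading the post through that pointer sees the new title without a refetch.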

Advanced Caching Patterns

Relay Cursor Connections

For paginated data, Relay Connections spec provides standardized pagination:

# Standardized connection pagination
query {
  user(id: "123") {
    postsConnection(first: 10, after: "cursor123") {
      edges {
        node {
          id
          title
        }
        cursor
      }
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
}

// Apollo Client ships a merge helper for Relay-style connections
import { InMemoryCache } from "@apollo/client";
import { relayStylePagination } from "@apollo/client/utilities";

const cache = new InMemoryCache({
  typePolicies: {
    User: {
      fields: {
        // Merges edges and pageInfo across pages of the connection
        postsConnection: relayStylePagination(),
      },
    },
  },
});

// Pagination query
const { data, fetchMore } = useQuery(POSTS_QUERY, {
  variables: { first: 10 },
});

// Load more
const loadMore = () => {
  return fetchMore({
    variables: {
      after: data.user.postsConnection.pageInfo.endCursor,
    },
    updateQuery: (prev, { fetchMoreResult }) => {
      if (!fetchMoreResult) return prev;

      return {
        user: {
          ...prev.user,
          postsConnection: {
            ...fetchMoreResult.user.postsConnection,
            edges: [
              ...prev.user.postsConnection.edges,
              ...fetchMoreResult.user.postsConnection.edges,
            ],
          },
        },
      };
    },
  });
};
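On the server side, the connection shape queried above can be produced from any ordered list. This sketch paginates an in-memory array; the `paginate` name and the use of stringified ids as cursors are simplifying assumptions (production cursors are usually opaque).

```javascript
// Build a Relay-style connection from an in-memory array.
// An unknown `after` cursor falls back to the start of the list.
function paginate(items, { first, after }) {
  const start = after ? items.findIndex((item) => String(item.id) === after) + 1 : 0;
  const page = items.slice(start, start + first);
  return {
    edges: page.map((node) => ({ node, cursor: String(node.id) })),
    pageInfo: {
      hasNextPage: start + first < items.length,
      endCursor: page.length > 0 ? String(page[page.length - 1].id) : null,
    },
  };
}

const posts = [1, 2, 3, 4, 5].map((id) => ({ id, title: `Post ${id}` }));
const pageOne = paginate(posts, { first: 2 });
const pageTwo = paginate(posts, { first: 2, after: pageOne.pageInfo.endCursor });
const lastPage = paginate(posts, { first: 2, after: pageTwo.pageInfo.endCursor });

console.log(
  pageTwo.edges.map((edge) => edge.node.title),
  lastPage.pageInfo.hasNextPage,
);
```

Feeding each page's `endCursor` back in as `after` is the same loop the `fetchMore` call performs on the client.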

Cache Invalidation Patterns

GraphQL cache invalidation is trickier than REST because the server doesn’t know what’s cached:

// Pattern 1: Cache as source of truth - mutate cache directly
const [createPost] = useMutation(CREATE_POST_MUTATION, {
  update: (cache, { data: { createPost } }) => {
    // Read existing query
    const existing = cache.readQuery({ query: USER_POSTS_QUERY });

    // Write new data to cache
    cache.writeQuery({
      query: USER_POSTS_QUERY,
      data: {
        user: {
          ...existing.user,
          posts: [createPost, ...existing.user.posts],
        },
      },
    });
  },
});

// Pattern 2: Refetch affected queries
const [deletePost] = useMutation(DELETE_POST_MUTATION, {
  // Re-run the queries that may still contain the deleted post
  refetchQueries: [{ query: USER_POSTS_QUERY }],
  // Resolve the mutation promise only after the refetch completes
  awaitRefetchQueries: true,
});

// Pattern 3: Using @client directive for local-only fields
import { InMemoryCache, gql, makeVar } from "@apollo/client";

// Reactive variable holding the local state
const isOnlineVar = makeVar(false);

const cache = new InMemoryCache({
  typePolicies: {
    User: {
      fields: {
        // Local-only field: read from the reactive variable, never sent to the server
        isOnline: { read: () => isOnlineVar() },
      },
    },
  },
});

// @client goes on the field in the query, not in the server schema
const GET_USER = gql`
  query GetUser($id: ID!) {
    user(id: $id) {
      name
      isOnline @client
    }
  }
`;

// Update local state directly
isOnlineVar(true);
// Queries reading isOnline re-render automatically

Cache Performance Tips

| Pattern | Use When | Benefit |
|---|---|---|
| keyArgs: ['field'] | Multiple similar queries | Cache reused across components |
| merge(existing, incoming) | Pagination | Append to lists correctly |
| read function | Transform data | Format dates, combine fields client-side |
| @client directive | Local-only state | No server roundtrip for UI state |
| cache-and-network fetch policy | Real-time updates | Show cached immediately, update from server |

Comparison Table

| Aspect | REST | GraphQL |
|---|---|---|
| Data fetching | Multiple requests | Single request |
| Data shape | Fixed per endpoint | Client specifies |
| Typing | Documentation | Schema enforced |
| Caching | HTTP caching | Custom caching |
| Learning curve | Lower | Higher |
| Tooling | Mature | Evolving |
| Error handling | HTTP status codes | 200 + error body |
| Overfetching | Common | Avoided |

Combining Both

You do not have to choose one exclusively. Some teams use REST for simple operations and GraphQL for complex data requirements.

# REST for simple operations
GET /health
GET /config

# GraphQL for complex data
POST /graphql

Schema Federation

As GraphQL usage scales across multiple teams, a single monolithic schema becomes unwieldy. Schema Federation and Schema Stitching are approaches for composing distributed GraphQL schemas.

Schema Federation (Apollo)

Federation decomposes a schema into independent subgraphs that can be developed and deployed separately:

graph TB
    Client[GraphQL Client] --> Gateway[Apollo Gateway]
    Gateway --> Users[Users Subgraph]
    Gateway --> Orders[Orders Subgraph]
    Gateway --> Products[Products Subgraph]
    Users --> UsersDB[(Users DB)]
    Orders --> OrdersDB[(Orders DB)]
    Products --> ProductsDB[(Products DB)]

// products subgraph - src/subgraphs/products.ts
import gql from 'graphql-tag';
import { buildSubgraphSchema } from '@apollo/subgraph';

const typeDefs = gql`
  type Product @key(fields: "id") {
    id: ID!
    name: String!
    price: Float!
  }

  extend type Query {
    product(id: ID!): Product
    productsOnSale: [Product!]!
  }
`;

const resolvers = {
  Product: {
    // Called by the gateway when another subgraph references a Product by its key
    __resolveReference: (reference) => getProductById(reference.id),
  },
  Query: {
    product: (_, { id }) => getProductById(id),
    productsOnSale: () => getProductsOnSale(),
  },
};

export const productsSchema = buildSubgraphSchema({ typeDefs, resolvers });

// users subgraph extends Product with reviews
// src/subgraphs/users.ts
import gql from 'graphql-tag';
import { buildSubgraphSchema } from '@apollo/subgraph';

const typeDefs = gql`
  extend type Product @key(fields: "id") {
    id: ID! @external
    reviews: [Review!]! # Contributed by this subgraph
  }

  type Review @key(fields: "id") {
    id: ID!
    rating: Int!
    comment: String!
    author: User!
  }

  type User @key(fields: "id") {
    id: ID!
    name: String!
    reviews: [Review!]!
  }
`;

// Resolvers omitted; they would load reviews by product id and by user id
export const usersSchema = buildSubgraphSchema({ typeDefs });

Schema Stitching (Legacy)

Stitching merges several schemas into one at the gateway, often with custom logic:

import { makeExecutableSchema } from "@graphql-tools/schema";
import { stitchSchemas } from "@graphql-tools/stitch";
import { delegateToSchema } from "@graphql-tools/delegate";

const usersSchema = makeExecutableSchema({
  typeDefs: usersTypeDefs,
  resolvers: usersResolvers,
});

const ordersSchema = makeExecutableSchema({
  typeDefs: ordersTypeDefs,
  resolvers: ordersResolvers,
});

// Merge schemas
const stitchedSchema = stitchSchemas({
  subschemas: [usersSchema, ordersSchema],

  // Link field added at the gateway (assumes ordersSchema defines Order)
  typeDefs: `extend type User { orders: [Order!]! }`,

  // Custom resolvers for cross-schema references
  resolvers: {
    User: {
      orders: {
        selectionSet: `{ id }`, // make sure user.id is fetched
        resolve(user, args, context, info) {
          return delegateToSchema({
            schema: ordersSchema,
            operation: "query",
            fieldName: "ordersByUser",
            args: { userId: user.id },
            context,
            info,
          });
        },
      },
    },
  },
});

Federation vs Stitching

| Aspect | Federation | Stitching |
|---|---|---|
| Architecture | Gateway + autonomous subgraphs | Centralized merged schema |
| Schema ownership | Teams own their types | Central team owns merged schema |
| Deployment | Independent subgraph deploys | Full redeploy on changes |
| Query planning | Gateway routes to subgraphs | Stitching layer plans queries |
| @key directive | Entities shared across subgraphs | Type merging for shared types |
| Production maturity | Battle-tested at scale (Netflix, Airbnb) | More complex, less common now |

When to Use Federation

Federation shines when:

  • Multiple teams own different domain areas
  • Services are already microservices
  • Independent deployment is important
  • You want clear ownership boundaries
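
# Example: User service owns User type, can reference Product
type User @key(fields: "id") {
  id: ID!
  name: String!
  # Products this user has purchased - reference to Product entity
  purchasedProducts: [Product!]!
}

# Stub of the Product entity, which the products subgraph owns
extend type Product @key(fields: "id") {
  id: ID! @external
}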

Hybrid Pattern: REST + GraphQL Federation

You can bring REST APIs into a federated graph alongside GraphQL subgraphs. The gateway only speaks GraphQL, so a REST API joins through a thin subgraph whose resolvers call the REST endpoints (for example with RESTDataSource from @apollo/datasource-rest):

import { ApolloGateway, IntrospectAndCompose } from "@apollo/gateway";

const gateway = new ApolloGateway({
  supergraphSdl: new IntrospectAndCompose({
    subgraphs: [
      { name: "users", url: "http://users-service/graphql" },
      { name: "products", url: "http://products-service/graphql" },
      // Thin GraphQL subgraph whose resolvers call the inventory REST API
      { name: "inventory", url: "http://inventory-service/graphql" },
    ],
  }),
});

When to Choose REST vs GraphQL

When to Choose REST

REST works well when:

  • Your API is simple with predictable data requirements
  • You need HTTP caching
  • You are building public APIs consumed by many clients
  • Your team is familiar with REST
  • You need simple documentation (just list endpoints)

Examples: simple CRUD applications, public APIs for third-party developers

When to Choose GraphQL

GraphQL works well when:

  • Clients need different data shapes
  • Mobile apps need minimal data transfer
  • The domain is complex, with many related entities
  • Frontend teams need to iterate rapidly
  • You want strong typing and schema validation

Examples: Mobile apps, complex dashboards, microservices with varying client needs


Trade-off Analysis

Decision Framework

| Factor | REST | GraphQL |
|---|---|---|
| Data Fetching | Multiple endpoints; over/underfetching common | Single endpoint; exact data shapes |
| Caching | Native HTTP caching; CDN-friendly | Custom client-side caching; persisted queries for CDN |
| Type Safety | No enforced schema; OpenAPI optional | Strongly typed schema; self-documenting |
| Learning Curve | Simpler for beginners; familiar HTTP concepts | Steeper; requires understanding of queries and resolvers |
| Tooling | Mature ecosystem; standard HTTP debugging | GraphQL-specific IDEs; introspection for discovery |
| Performance | Predictable; caching reduces load | N+1 risk without DataLoader; complex query optimization |
| Real-time | Polling or webhooks; not native | Native subscriptions via WebSockets |
| Security | Endpoint-based rate limiting; familiar patterns | Query complexity limits; introspection control |
| Schema Evolution | Versioned APIs; breaking changes managed | Additive changes; @deprecated directive |
| Team Ownership | Endpoint-per-team; clear boundaries | Schema ownership; federation for scaling |

Coexistence Pattern

Many organizations use both. A pragmatic approach:

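// api/index.js
import express from "express";
import { graphql } from "graphql";
// Assumed local modules: adjust the paths to your project layout
import { schema } from "./schema.js";
import { usersRestRouter, healthCheckRouter } from "./rest-routers.js";
import { graphqlMiddleware } from "./graphql-middleware.js";

const app = express();

// REST for simple, stable resources
app.use("/api/users", usersRestRouter);
app.use("/api/health", healthCheckRouter);

// GraphQL for complex, client-driven data
app.use("/api/graphql", graphqlMiddleware);

// Bridge: a REST endpoint executes a GraphQL operation internally
app.get("/api/users/:id/summary", async (req, res) => {
  const result = await graphql({
    schema,
    source: `query UserSummary($id: ID!) {
      user(id: $id) { name email role }
    }`,
    variableValues: { id: req.params.id },
  });
  // GraphQL reports failures in the result body, so map them to an HTTP status here
  if (result.errors) {
    return res.status(500).json({ errors: result.errors.map((e) => e.message) });
  }
  res.json(result.data);
});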

When to Migrate

| Signal | Action |
| --- | --- |
| Mobile app requests 15+ REST calls per screen | Consider GraphQL |
| Frontend team blocked by backend endpoint pace | GraphQL gives frontend autonomy |
| Complex data graphs with varying client needs | GraphQL shines here |
| Heavy HTTP caching requirements | REST likely sufficient |
| Simple CRUD with predictable data shapes | REST is probably fine |
| Third-party public API | REST for stability and caching |

Production Failure Scenarios

| Failure | Impact | Mitigation |
| --- | --- | --- |
| Query complexity explosion | Server overwhelmed; possible DoS | Implement query depth limiting; set complexity budgets |
| N+1 query problem | Database flooded with queries; slow responses | Use DataLoader for batching; optimize resolvers |
| Schema introspection abuse | Information leakage; enumeration attacks | Disable introspection in production; restrict access |
| Subscription memory leaks | Server memory grows; eventual crash | Set subscription limits; implement timeouts |
| Mutation race conditions | Data inconsistency between related operations | Implement optimistic locking; use transactions |
| Persisted query abuse | Attackers pre-store malicious queries | Validate persisted query hashes; rate limit |
| Error masking with 200 OK | Errors hidden; debugging difficult | Return proper error status codes; use extensions |

Observability Checklist

Metrics

  • Query rate by operation type (query/mutation/subscription)
  • Query complexity distribution
  • Request duration by operation and field
  • Error rate by error type
  • DataLoader batch efficiency (cache hit ratio)
  • Schema introspection requests
  • Subscription active count
  • Query depth and breadth distribution

Logs

  • Query requests with variables and complexity
  • Mutation requests with authorization context
  • DataLoader batch operations and cache misses
  • Error responses with path and locations
  • Schema change events
  • Subscription lifecycle events
  • Security events (introspection attempts, rate limit hits)

Alerts

  • Query complexity exceeds threshold
  • Error rate exceeds normal baseline
  • DataLoader cache hit ratio drops below 80%
  • Subscription count exceeds limits
  • Unusual introspection activity
  • Query depth spikes that may indicate an attack

Security Checklist

  • Disable schema introspection in production
  • Implement query complexity limits and depth limits
  • Use persisted queries to prevent abuse
  • Validate and sanitize all variable inputs
  • Implement proper authorization at resolver level
  • Log and monitor unusual query patterns
  • Rate limit queries per client
  • Protect against batching attacks (arrays of operations in one request, or many aliased fields in one query)
  • Use query whitelisting for sensitive operations
  • Validate that mutations affect only intended fields
  • Implement request timeout at GraphQL layer
  • Do not expose internal error details in responses

Common Pitfalls / Anti-Patterns

Overusing GraphQL for Simple Cases

GraphQL adds complexity. Simple REST endpoints may be better.

# Overkill: GraphQL for simple, predictable data
query {
  healthCheck {
    status
  }
}

# Better: Simple REST endpoint
GET /health

Ignoring N+1 Queries

GraphQL makes N+1 problems easy to create.

# Problem: a naive posts resolver fetches posts for each user separately
query {
  users {
    name
    posts {
      title
    } # N queries for N users
  }
}

# Better: the query is unchanged; the posts resolver uses DataLoader to batch (a server-side fix)
query {
  users {
    name
    posts {
      title
    } # Single batched query
  }
}
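
Since the fix lives in the resolvers rather than the query, it helps to see what batching looks like on the server. The following is a minimal sketch, not the DataLoader library itself: `createBatchLoader` is a simplified stand-in written for illustration, and `fetchPostsByUserIds` and the `POSTS` array are hypothetical placeholders for a real database query. It omits the per-request caching that DataLoader also provides.

```javascript
// Simplified stand-in for DataLoader's batching (not the real library API).
// Loads requested in the same tick are collected and resolved by one batch call.
function createBatchLoader(batchFn) {
  let queue = []; // pending { key, resolve, reject } entries

  function flush() {
    const pending = queue;
    queue = [];
    const keys = pending.map((entry) => entry.key);
    Promise.resolve(batchFn(keys)).then(
      (values) => pending.forEach((entry, i) => entry.resolve(values[i])),
      (error) => pending.forEach((entry) => entry.reject(error))
    );
  }

  return {
    load(key) {
      return new Promise((resolve, reject) => {
        queue.push({ key, resolve, reject });
        // Schedule a single flush when the first key of this tick is queued
        if (queue.length === 1) queueMicrotask(flush);
      });
    },
  };
}

// Hypothetical data and batch function standing in for a real database query
const POSTS = [
  { userId: 1, title: "Hello" },
  { userId: 2, title: "GraphQL notes" },
  { userId: 1, title: "REST notes" },
];
let batchCalls = 0;

async function fetchPostsByUserIds(userIds) {
  batchCalls += 1; // one call represents one database round trip
  return userIds.map((id) => POSTS.filter((post) => post.userId === id));
}

const postsLoader = createBatchLoader(fetchPostsByUserIds);

// Resolver: every User.posts call goes through the loader
const resolvers = {
  User: { posts: (user) => postsLoader.load(user.id) },
};
```

When the `posts` resolver runs for every user in a list, the loads are queued in the same tick and `fetchPostsByUserIds` is called once with all the user IDs. In a real server the loader would be created per request so results never leak between users.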

Not Implementing Proper Error Handling

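Most GraphQL servers return 200 OK even when the response contains errors.

// Problem: error masked as success, with nothing machine-readable to act on
{
  "data": { "user": null },
  "errors": [{ "message": "Not authorized" }]
}

// Better: add a machine-readable code in extensions,
// and use a proper HTTP status code when the whole request fails
{
  "errors": [{
    "message": "Not authorized",
    "extensions": { "code": "UNAUTHORIZED" }
  }]
}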

Exposing Schema Internals

Introspection can reveal your entire schema.

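// Disable introspection and the playground outside development
import { ApolloServer } from "apollo-server";

const isProduction = process.env.NODE_ENV === "production";

const server = new ApolloServer({
  schema, // executable schema built elsewhere
  introspection: !isProduction,
  playground: !isProduction, // Apollo Server 2.x option; later versions removed it
});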


Interview Questions

Fundamentals

1. What is the N+1 problem in GraphQL, and how does DataLoader solve it?

The N+1 problem occurs when fetching a list of items triggers separate requests for each item's related data. For example, querying 100 users and their posts makes 1 query for users plus 100 queries for posts.

DataLoader solves this by:

  • Batching: Collecting all requested IDs during query execution, then loading them in a single database query
  • Caching: Memoizing results within a request to avoid duplicate loads
  • Scheduling: Coalescing every load issued by resolvers in the same tick of the event loop into one batch

2. Explain the difference between mutations and queries in GraphQL. Why are mutations typically named with verb patterns (createUser, updatePost) while queries use noun patterns (user, posts)?

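Queries are side-effect-free reads: they don't modify data. Mutations cause side effects and modify server state. Execution differs too: top-level query fields may resolve in parallel, while top-level mutation fields run serially, in the order they are written.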

Naming conventions:

  • Queries as nouns (user, posts) — you "ask for" data
  • Mutations as verbs (createUser, updatePost) — you "command" an action

This convention makes it immediately clear in tooling and documentation which operations modify state.

3. Why does GraphQL return HTTP 200 OK even for errors? What are the trade-offs?

The reasoning is that HTTP and GraphQL live on separate layers. A query might partially succeed—some fields resolve, others don't. Returning 200 with both data and errors preserves that nuance.

It trips up traditional monitoring, though. Your HTTP-aware tools expect non-2xx for failures, not a 200 with an error body. And authorization errors masquerading as successes is genuinely unsettling.

Best practice: stick error codes in extensions so clients can actually detect failures.
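
As an illustration of what detecting failures can look like on the client, here is a minimal sketch that inspects the response body instead of the HTTP status. The `classifyResponse` helper and the `UNAUTHORIZED` code are assumptions made for the example; the spec only defines the `errors` array and its optional `extensions` map.

```javascript
// Classifies a GraphQL response body regardless of the HTTP status code.
// The `extensions.code` values are assumed conventions, not part of the spec.
function classifyResponse(body) {
  const errors = Array.isArray(body.errors) ? body.errors : [];
  if (errors.length === 0) return { status: "ok", codes: [] };

  const codes = errors.map((e) => (e.extensions && e.extensions.code) || "UNKNOWN");
  // Non-null data alongside errors means some fields resolved: a partial success
  const hasData =
    body.data != null && Object.values(body.data).some((value) => value !== null);
  return { status: hasData ? "partial" : "failed", codes };
}

const response = {
  data: { user: null },
  errors: [
    { message: "Not authorized", path: ["user"], extensions: { code: "UNAUTHORIZED" } },
  ],
};

console.log(classifyResponse(response));
```

Feeding the resulting status and codes into metrics gives HTTP-aware monitoring something to alert on even when every request comes back as 200.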

Design & Architecture

4. How would you implement real-time updates in GraphQL? Compare subscriptions vs polling vs webhooks.

GraphQL Subscriptions use WebSockets for bidirectional communication. The server pushes data when relevant events occur.

Comparison:

  • Subscriptions: Instant push, long-lived connections, best for UI updates
  • Polling: Simple, HTTP-based, predictable load—good for infrequent updates
  • Webhooks: HTTP callbacks from server to client, standard for cross-service events

Many systems combine them, picking one per use case.

5. What is schema stitching and how does it differ from schema federation?

Schema stitching merges multiple schemas into one at a gateway layer. The gateway acts as a unified API surface.

Federation (Apollo) decomposes schema ownership—each subgraph owns its types and the gateway routes queries to the right subgraph. Federation is more scalable and team-friendly.

Stitching is older, more complex to maintain, and less commonly used in new projects.

6. Explain how persisted queries improve GraphQL security and performance.

Persisted queries replace full query strings with SHA-256 hashes sent from client to server.

Security: Only pre-registered queries execute—prevents query injection attacks and introspection abuse.

Performance: Queries are pre-validated at build time—no parsing or validation overhead at runtime. Network payload shrinks from full query to 64-character hash.
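
The allowlist idea can be sketched with Node's built-in crypto module. This is an illustration under assumptions, not any particular server's implementation: `buildRegistry`, `resolvePersistedQuery`, and the error name are hypothetical names chosen for the example.

```javascript
const { createHash } = require("node:crypto");

// SHA-256 of the query text, hex-encoded (64 characters)
function hashQuery(query) {
  return createHash("sha256").update(query, "utf8").digest("hex");
}

// Build-time step: register every operation the clients are allowed to send
function buildRegistry(queries) {
  const registry = new Map();
  for (const query of queries) registry.set(hashQuery(query), query);
  return registry;
}

// Request-time step: accept only known hashes and reject everything else
function resolvePersistedQuery(registry, sha256Hash) {
  const query = registry.get(sha256Hash);
  if (!query) throw new Error("PERSISTED_QUERY_NOT_FOUND");
  return query;
}

const USER_QUERY = "query GetUser($id: ID!) { user(id: $id) { name } }";
const registry = buildRegistry([USER_QUERY]);
const hash = hashQuery(USER_QUERY);

console.log(hash.length); // 64
console.log(resolvePersistedQuery(registry, hash) === USER_QUERY); // true
```

The client sends only `hash`; any hash that was not registered at build time never reaches the executor.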

7. How does normalized caching in Apollo Client work, and when would you use it?

Apollo's normalized cache stores each entity once, keyed by its __typename plus its id (for example User:123), and query results hold references to those entries. Updating User:123 therefore updates every cached query that references that user, which prevents cache inconsistency.

Use normalized cache when:

  • Multiple components show the same entity
  • Mutations update entities that appear in cached queries
  • You want automatic cache updates without manual refetchQueries
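
The mechanism can be sketched in a few lines. This is a toy model of the idea, not Apollo Client's implementation; `writeEntity`, `readRef`, and the sample data are hypothetical.

```javascript
// Toy normalized cache: entities live in one flat map keyed by `__typename:id`.
// This mirrors the idea behind Apollo Client's cache, not its implementation.
const entities = new Map();

function cacheKey(entity) {
  return `${entity.__typename}:${entity.id}`;
}

// Merge an entity into the store and return a reference to it
function writeEntity(entity) {
  const key = cacheKey(entity);
  entities.set(key, { ...(entities.get(key) || {}), ...entity });
  return { __ref: key };
}

// Resolve a reference back to the current entity data
function readRef(ref) {
  return entities.get(ref.__ref);
}

// Two different query results point at the same user
const profileQuery = {
  user: writeEntity({ __typename: "User", id: "123", name: "Ada" }),
};
const sidebarQuery = {
  currentUser: writeEntity({ __typename: "User", id: "123", role: "admin" }),
};

// A mutation result updates the entity once...
writeEntity({ __typename: "User", id: "123", name: "Ada Lovelace" });

// ...and both queries observe the new name on their next read
console.log(readRef(profileQuery.user).name); // "Ada Lovelace"
console.log(readRef(sidebarQuery.currentUser).name); // "Ada Lovelace"
```

Because both query results store a reference rather than a copy, there is only one place to update and no stale duplicate to forget.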
8. Describe the trade-offs between GraphQL and REST for a public API consumed by many third-party developers.

REST wins for public APIs because HTTP caching is built in, debugging is straightforward, and providers can rate-limit per endpoint. GraphQL gives clients flexibility to fetch exactly what they need with a self-documenting schema, but that flexibility comes at a cost: no standard HTTP caching, a steeper learning curve, and harder debugging.

For diverse third-party consumers, REST is usually the safer bet. For internal tools or complex client-driven UIs, GraphQL pays off.

Advanced & Production

9. How would you secure a GraphQL API against query complexity attacks?

Stack several approaches:

  • Query depth limits stop deeply nested attacks
  • Complexity analysis assigns costs to fields—reject anything too expensive
  • Persisted queries mean only pre-registered operations run
  • Rate limiting per client prevents abuse
  • Timeouts kill queries that run too long
  • Turn off introspection in production so attackers can't enumerate your schema
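
To make the depth-limit idea concrete, here is a rough sketch that estimates depth by counting nested selection sets in the query text. It is an approximation under stated assumptions: a production server would walk the parsed AST (and resolve fragments) rather than scan the string, and `estimateDepth` and `assertDepthWithin` are hypothetical helper names.

```javascript
// Rough depth estimate by counting nested braces in the query text.
// Assumptions: fragments are not followed, input object literals in arguments
// also count as a level, and only "..." string literals are skipped.
// The operation's own selection set counts as depth 1.
function estimateDepth(query) {
  let depth = 0;
  let maxDepth = 0;
  let inString = false;
  for (let i = 0; i < query.length; i++) {
    const ch = query[i];
    if (ch === '"' && query[i - 1] !== "\\") inString = !inString;
    if (inString) continue;
    if (ch === "{") maxDepth = Math.max(maxDepth, ++depth);
    if (ch === "}") depth--;
  }
  return maxDepth;
}

// Reject the query before execution when it nests deeper than the limit
function assertDepthWithin(query, limit) {
  const depth = estimateDepth(query);
  if (depth > limit) {
    throw new Error(`Query depth ${depth} exceeds limit ${limit}`);
  }
  return depth;
}

const nested = "query { user(id: 1) { friends { friends { friends { name } } } } }";
console.log(estimateDepth(nested)); // 5
```

Running the check before execution means an abusive query costs one string scan instead of a cascade of resolver calls. Complexity analysis follows the same shape, summing a per-field cost instead of tracking the maximum depth.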
10. What strategies would you use for GraphQL caching at the CDN and browser level?

CDN caching breaks with GraphQL because POST bodies aren't URL-keyed. Your options:

  • Persisted queries as GET requests—hash goes in the URL, CDN can cache
  • Response caching keyed by operation name plus variables hash
  • Some CDNs (Cloudflare, for example) can fingerprint GraphQL requests and cache accordingly

On the browser, Apollo or Relay handle client-side caching. Service workers add offline support. Normalized cache keeps shared entity data consistent across components.
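
A sketch of the first option, turning an operation into a cacheable GET URL. The parameter layout follows the convention used by Apollo's automatic persisted queries as commonly documented; the endpoint is a placeholder and `buildPersistedGetUrl` is a hypothetical helper.

```javascript
const { createHash } = require("node:crypto");

// Builds a GET URL that carries the query hash instead of the query text,
// so a CDN can key its cache on the URL. The endpoint below is a placeholder.
function buildPersistedGetUrl(endpoint, query, operationName, variables) {
  const sha256Hash = createHash("sha256").update(query, "utf8").digest("hex");
  const params = new URLSearchParams({
    operationName,
    variables: JSON.stringify(variables),
    extensions: JSON.stringify({ persistedQuery: { version: 1, sha256Hash } }),
  });
  return `${endpoint}?${params.toString()}`;
}

const exampleUrl = buildPersistedGetUrl(
  "https://api.example.com/graphql",
  "query GetUser($id: ID!) { user(id: $id) { name } }",
  "GetUser",
  { id: "123" }
);
console.log(exampleUrl);
```

Identical operations with identical variables produce identical URLs, which is what lets the CDN serve repeats from cache. Variables should be serialized with a stable key order, or equivalent requests will miss the cache.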

11. How does DataLoader handle cache misses and partial failures in batches?

DataLoader queues up requests during execution, then fires one batched query when the loader runs. Each key either hits cache or misses and gets batch-loaded. Results map back to the original key order.

Partial failures are interesting—the batch function can return Error objects alongside successful results. DataLoader propagates the Error to whichever field requested the failed key, while everything else succeeds. Nice for granular error handling.
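
A batch function that reports partial failures might look like the sketch below. It follows DataLoader's documented contract (an array with the same length and order as the keys, where each slot is a value or an Error); `findUsersByIds` and the `USERS` array are hypothetical stand-ins for a real data source.

```javascript
// Batch function following DataLoader's contract: return an array with the
// same length and order as `keys`, where each slot is a value or an Error.
async function batchLoadUsers(keys, findUsersByIds) {
  const rows = await findUsersByIds(keys); // one query; rows may arrive in any order
  const byId = new Map(rows.map((row) => [row.id, row]));
  // Map results back to key order; missing rows become per-key errors
  return keys.map((key) => byId.get(key) || new Error(`User ${key} not found`));
}

// In-memory stand-in for the database call
const USERS = [
  { id: 2, name: "Grace" },
  { id: 1, name: "Ada" },
];
const findUsersByIds = async (ids) => USERS.filter((user) => ids.includes(user.id));

batchLoadUsers([1, 99, 2], findUsersByIds).then((results) => {
  for (const result of results) {
    console.log(result instanceof Error ? `error: ${result.message}` : result.name);
  }
});
```

Wired into the library, this would be passed as `new DataLoader((keys) => batchLoadUsers(keys, findUsersByIds))`, so only the field that asked for key 99 sees the error.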

12. Explain when you would choose cursor-based pagination vs offset-based pagination in GraphQL.

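Offset pagination (skip/limit) is dead simple but doesn't scale. Fetching page N makes the database scan and discard roughly N times the page size in rows before it returns anything. And if a row is inserted or deleted between requests, everything shifts, so clients see duplicated or skipped items.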

Cursors use stable markers. Page 2 after page 1 stays consistent even with concurrent writes. This is why infinite scroll and social feeds use cursors.

Use offset for small admin UIs where jumping to page 5 directly is valuable. Use cursors for anything user-facing with large datasets.
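
Here is a small sketch of cursor pagination over an in-memory list, assuming items sorted by an ascending numeric `id`. The cursor format and the `paginate` helper are illustrative choices; the `edges` and `pageInfo` shape follows the common connection-style layout.

```javascript
// Opaque cursor: base64-encoded id of the last item on the previous page
const encodeCursor = (id) => Buffer.from(`id:${id}`).toString("base64");
const decodeCursor = (cursor) =>
  Number(Buffer.from(cursor, "base64").toString("utf8").slice(3));

// Returns a connection-style page from items sorted by ascending id.
// With a database this becomes: WHERE id > :afterId ORDER BY id LIMIT :first + 1
function paginate(items, { first, after }) {
  const afterId = after ? decodeCursor(after) : -Infinity;
  // Fetch one extra item to learn whether another page exists
  const slice = items.filter((item) => item.id > afterId).slice(0, first + 1);
  const page = slice.slice(0, first);
  return {
    edges: page.map((node) => ({ node, cursor: encodeCursor(node.id) })),
    pageInfo: {
      hasNextPage: slice.length > first,
      endCursor: page.length ? encodeCursor(page[page.length - 1].id) : null,
    },
  };
}

const posts = [1, 2, 3, 4, 5].map((id) => ({ id, title: `Post ${id}` }));
const page1 = paginate(posts, { first: 2 });

// A row inserted before the cursor position does not shift the next page
posts.unshift({ id: 0, title: "Post 0" });
const page2 = paginate(posts, { first: 2, after: page1.pageInfo.endCursor });

console.log(page2.edges.map((edge) => edge.node.id)); // [3, 4]
```

With offset pagination the same insert would have pushed post 2 onto the second page, where the client would see it twice.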

13. What are the key considerations when migrating from REST to GraphQL in a large organization?

A few things trip people up:

  • Teams need time to internalize GraphQL thinking—it's not just another REST
  • Your existing tooling (CI/CD, monitoring, API gateways) probably needs updates
  • Don't rewrite everything at once—run GraphQL alongside REST and migrate domain by domain
  • Schema ownership gets thorny fast: who can modify what? Set clear boundaries early
  • N+1 problems will surface. DataLoader isn't optional.
  • Security surface is different: introspection attacks, query complexity, different rate-limiting vectors

14. How would you handle file uploads in a GraphQL API?

GraphQL spec says nothing about files. You've got options:

  • Multipart requests via graphql-upload library
  • Base64 encoding (terrible—bloats the payload)
  • Pre-signed URLs (S3, GCS, whatever)—upload goes direct, you just pass the URL to GraphQL
  • REST for uploads, GraphQL for the mutation that references the result

Pre-signed URLs scale best. The heavy lifting happens outside your server.

15. Describe how you would implement authorization at the field level in GraphQL.

Do it in resolvers:

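const resolvers = {
  User: {
    // Only the user themselves or an admin may read the email field
    email: (user, args, context) => {
      const viewer = context.user; // undefined for unauthenticated requests
      if (!viewer || (viewer.id !== user.id && !viewer.isAdmin)) {
        return null;
      }
      return user.email;
    },
  },
};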

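Always check at the resolver level; never trust the client to filter. Returning null for unauthorized fields (which requires the field to be nullable in the schema) yields partial data without adding an entry to errors. Throwing is the alternative: the error is recorded, and the null propagates up to the nearest nullable parent, which can wipe out more of the response when the field is non-null.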

16. What is the relationship between GraphQL subscriptions and WebSockets? How do you handle subscription scaling in a distributed environment?

The GraphQL spec does not mandate a transport, but subscriptions are most often carried over WebSockets because they need bidirectional, long-lived connections. The client opens a WebSocket, sends a subscription query, and the server pushes updates when events occur.

For distributed scaling (multiple GraphQL servers):

  • Use Redis Pub/Sub or RabbitMQ to broadcast events across server instances
  • Each server subscribes to the message broker and forwards relevant events to its connected clients
  • Connection state (subscribed channels per client) must be tracked and cleaned up on disconnect
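
The fan-out pattern can be sketched in a single process. In this illustration an `EventEmitter` stands in for the message broker (Redis Pub/Sub or RabbitMQ in a real deployment), and each `SubscriptionServer` instance represents one GraphQL server; both are assumptions made so the example runs without external services.

```javascript
const { EventEmitter } = require("node:events");

// In-process stand-in for a broker such as Redis Pub/Sub
const broker = new EventEmitter();

// One instance of this class represents one GraphQL server process
class SubscriptionServer {
  constructor(name) {
    this.name = name;
    this.clients = new Map(); // clientId -> { topic, send }
    // Every server listens to the broker and forwards only relevant events
    broker.on("event", (event) => {
      for (const client of this.clients.values()) {
        if (client.topic === event.topic) client.send(event.payload);
      }
    });
  }

  subscribe(clientId, topic, send) {
    this.clients.set(clientId, { topic, send });
  }

  // Clean up connection state when the socket closes
  disconnect(clientId) {
    this.clients.delete(clientId);
  }
}

const serverA = new SubscriptionServer("a");
const serverB = new SubscriptionServer("b");
const received = [];
serverA.subscribe("client-1", "postAdded", (p) => received.push(["client-1", p.title]));
serverB.subscribe("client-2", "postAdded", (p) => received.push(["client-2", p.title]));

// A mutation handled by any server publishes once; each server fans out locally
broker.emit("event", { topic: "postAdded", payload: { title: "Hello" } });
serverB.disconnect("client-2");
broker.emit("event", { topic: "postAdded", payload: { title: "Again" } });

console.log(received.length); // 3
```

The second event reaches only `client-1`, because `client-2` was removed from its server's connection map on disconnect.
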
17. How does GraphQL handle authentication and authorization at the API level?

GraphQL sits on top of HTTP, so authentication is typically HTTP-based:

  • JWT tokens in Authorization header—validated in middleware before GraphQL execution
  • Cookies for web apps; tokens for mobile/native clients
  • Authorization happens in resolvers—check context.user permissions per field

Field-level authorization: return null for unauthorized fields rather than throwing, so partial data still returns.

18. What are the advantages and disadvantages of schema-first vs code-first GraphQL development?

Schema-first: you write SDL first, then implement resolvers against it. Pros: clear contract, easy review, self-documenting. Cons: resolver code may drift from schema.

Code-first: you define types in code (using decorators, classes, or builder patterns) and schema is generated from them. Pros: type safety in your language, less duplication. Cons: schema is derived, not explicit.

Most teams end up with a hybrid—they write SDL for complex types but generate parts programmatically.

19. Explain the purpose and behavior of the `@deprecated` directive in GraphQL. How does it support schema evolution?

The @deprecated directive marks fields or enum values as obsolete:

type User {
  name: String!
  fullName: String @deprecated(reason: "Use 'name' instead")
  role: UserRole!
}

Deprecation is exposed through introspection, so client tooling (IDEs, linters, code generators) can flag deprecated fields with warnings. This lets you:

  • Remove fields gradually—old clients see warnings, new clients use the replacement
  • Communicate breaking changes without immediately cutting off old clients
  • Keep schema backward compatible while guiding migration

20. How would you approach testing a GraphQL API? What tools and strategies do you use?

Test at multiple layers:

  • Unit tests: resolver logic with mock context
  • Integration tests: execute full queries against a test database
  • Schema tests: validate that schema transformations produce expected output

Tools:

  • graphql-testing-library for integration tests
  • @graphql-tools/mock for schema mocking
  • Apollo Server's executeOperation for running operations end to end without an HTTP layer
  • Load testing with k6 or artillery for performance


Conclusion

REST and GraphQL make different trade-offs. REST is resource-oriented with fixed endpoints and built-in HTTP caching. GraphQL is query-oriented with flexible data fetching and strong typing.

REST works well for simple, predictable APIs where caching matters. GraphQL works well for complex, client-driven data requirements where you want to avoid overfetching. Neither is universally better.

For REST API design, see the RESTful API Design post. For versioning both types of APIs, see the API Versioning Strategies post.
