GraphQL Schema & N+1 Problem

⏱️ ~4-minute bite · solve the sandbox to master


🧒

5-Year-Old Metaphor

— The physical, real-world picture. No jargon.

📚 N+1 = calling the librarian 10 times for 10 authors. DataLoader = giving the librarian a list and getting all 10 at once.

N+1 — the naive resolver

You ask: "Give me 10 posts with author names." The resolver fetches 10 posts (1 query), then calls the librarian once per post: "Who wrote post 1? Post 2? Post 3?..." — 10 separate trips to the shelf.

SELECT * FROM posts LIMIT 10; -- 1

SELECT * FROM users WHERE id=1; -- 2

SELECT * FROM users WHERE id=2; -- 3

... × 10 = 11 queries total
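The query count above can be reproduced with a minimal sketch. The in-memory arrays and the `selectPosts` / `selectUserById` helpers are hypothetical stand-ins for a database client, and they are synchronous only to keep the example short; each call increments a counter as if it were one SQL query.

```typescript
// Hypothetical in-memory "database"; every helper call counts as one query.
interface User { id: number; name: string }
interface Post { id: number; title: string; authorId: number }

const users: User[] = Array.from({ length: 10 }, (_, i) => ({ id: i + 1, name: `user${i + 1}` }));
const posts: Post[] = Array.from({ length: 10 }, (_, i) => ({ id: i + 1, title: `post${i + 1}`, authorId: i + 1 }));

let queryCount = 0;

function selectPosts(limit: number): Post[] {
  queryCount++; // SELECT * FROM posts LIMIT :limit
  return posts.slice(0, limit);
}

function selectUserById(id: number): User | undefined {
  queryCount++; // SELECT * FROM users WHERE id = :id
  return users.find(u => u.id === id);
}

// Naive "resolver": one user lookup per post.
function postsWithAuthorsNaive(limit: number): Array<{ title: string; author: string }> {
  return selectPosts(limit).map(p => ({
    title: p.title,
    author: selectUserById(p.authorId)?.name ?? "unknown",
  }));
}

const naiveResult = postsWithAuthorsNaive(10);
console.log(queryCount); // 11 = 1 (posts) + 10 (one per author)
```

The cost grows with the page size: 100 posts would mean 101 queries.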

DataLoader — the smart librarian

DataLoader waits for all 10 author requests to arrive in the same tick, then goes to the shelf once with a list of 10 IDs: "Get me all of these." One trip, same result.

SELECT * FROM posts LIMIT 10; -- 1

SELECT * FROM users

WHERE id IN (1,2,...,10); -- 2

2 queries total — same data!
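As a rough sketch of the batched shape, the version below collects the author IDs first and then makes one `IN (...)` style lookup. It batches by hand rather than through DataLoader, and the in-memory arrays and the `selectPosts` / `selectUsersByIds` helpers are hypothetical stand-ins for a database client.

```typescript
// Hypothetical in-memory "database"; every helper call counts as one query.
interface User { id: number; name: string }
interface Post { id: number; title: string; authorId: number }

const users: User[] = Array.from({ length: 10 }, (_, i) => ({ id: i + 1, name: `user${i + 1}` }));
const posts: Post[] = Array.from({ length: 10 }, (_, i) => ({ id: i + 1, title: `post${i + 1}`, authorId: i + 1 }));

let queryCount = 0;

function selectPosts(limit: number): Post[] {
  queryCount++; // SELECT * FROM posts LIMIT :limit
  return posts.slice(0, limit);
}

function selectUsersByIds(ids: number[]): User[] {
  queryCount++; // SELECT * FROM users WHERE id IN (:ids)
  return users.filter(u => ids.indexOf(u.id) !== -1);
}

function postsWithAuthorsBatched(limit: number): Array<{ title: string; author: string }> {
  const page = selectPosts(limit);
  // Deduplicate so an author with several posts is requested only once.
  const authorIds = Array.from(new Set(page.map(p => p.authorId)));
  const byId = new Map<number, User>();
  selectUsersByIds(authorIds).forEach(u => byId.set(u.id, u));
  return page.map(p => ({ title: p.title, author: byId.get(p.authorId)?.name ?? "unknown" }));
}

const batchedResult = postsWithAuthorsBatched(10);
console.log(queryCount); // 2 = 1 (posts) + 1 (all authors)
```

The lookup map matters because an `IN` query does not promise to return rows in the order the IDs were given.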

GraphQL vs REST — the real comparison

Concern	REST	GraphQL
Over-fetching	Common (response shape fixed by server)	Avoidable (client selects fields)
Under-fetching	Multiple endpoints (N+1 in REST too)	One query, nested data
Versioning	URL versions (/v1, /v2)	@deprecated + schema evolution
Type safety	OpenAPI spec (optional)	Schema is the type system
Caching	URL-based (CDN-friendly)	POST-based (no HTTP cache by default)
N+1 risk	In REST clients (waterfall fetching)	In resolvers (requires DataLoader)

🎛️

Interactive Sandbox

— Move something, see it react instantly.

Pattern

GraphQL schema

# Query type: reads
type Query {
  posts(first: Int, after: String): PostConnection!
  post(id: ID!): Post
  user(id: ID!): User
}

# Mutation type: writes
type Mutation {
  createPost(input: CreatePostInput!): Post!
  updatePost(id: ID!, input: UpdatePostInput!): Post!
  deletePost(id: ID!): Boolean!
}

# Subscription type: real-time
type Subscription {
  newPost: Post!
  postUpdated(id: ID!): Post!
}

# Types
type Post {
  id: ID!
  title: String!
  author: User! # nested resolver → N+1 risk
  createdAt: DateTime!
}

type User {
  id: ID!
  name: String!
  email: String!
  posts: [Post!]!
}

Pipeline steps

Schema-first: SDL written first, resolvers implement it

Code-first: types in TS → SDL generated (Pothos, TypeGraphQL)

The schema is the contract. Breaking changes (removing fields, changing a field's type or nullability) require versioning or field deprecation. Mark fields with the @deprecated directive and remove them only after clients have migrated; never just delete them.
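To make "breaking change" concrete, here is a small sketch that compares two versions of one object type. The `FieldMap` shape and the `diffFields` helper are hypothetical and only cover removed fields and changed type strings. For real schemas, graphql-js ships a `findBreakingChanges` utility that works on full schema objects.

```typescript
// Field name -> GraphQL type string, for a single object type (hypothetical shape).
type FieldMap = Record<string, string>;

// Report removed fields and changed types; added fields are treated as safe.
function diffFields(oldFields: FieldMap, newFields: FieldMap): string[] {
  const problems: string[] = [];
  for (const fieldName of Object.keys(oldFields)) {
    if (!(fieldName in newFields)) {
      problems.push(`removed field: ${fieldName}`);
    } else if (oldFields[fieldName] !== newFields[fieldName]) {
      problems.push(`changed type: ${fieldName} (${oldFields[fieldName]} -> ${newFields[fieldName]})`);
    }
  }
  return problems;
}

const userV1: FieldMap = { id: "ID!", name: "String!", email: "String!" };
// Adding displayName is safe; dropping email or loosening name to nullable is not.
const userV2: FieldMap = { id: "ID!", name: "String", displayName: "String!" };

const breaking = diffFields(userV1, userV2);
console.log(breaking);
// [ 'changed type: name (String! -> String)', 'removed field: email' ]
```

A check like this fits naturally in CI, so a pull request that removes a field fails before it reaches clients.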

Gotcha: Not limiting query depth allows deeply nested queries like { posts { author { posts { author { posts ... } } } } }. Without depth limiting, one query can fetch millions of rows. Use graphql-depth-limit or complexity analysis.
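The idea behind a depth limit can be sketched without any library by tracking brace nesting in the query text. This is a deliberate simplification: `approximateDepth` and `assertDepthAllowed` are hypothetical helpers, they ignore braces inside string literals, comments, and fragments, and a real validator such as graphql-depth-limit works on the parsed AST instead.

```typescript
// Approximate nesting depth by tracking brace balance in the query text.
function approximateDepth(query: string): number {
  let depth = 0;
  let maxDepth = 0;
  for (let i = 0; i < query.length; i++) {
    const ch = query[i];
    if (ch === "{") {
      depth++;
      if (depth > maxDepth) maxDepth = depth;
    } else if (ch === "}") {
      depth--;
    }
  }
  return maxDepth;
}

// Reject the query before execution if it nests deeper than the limit.
function assertDepthAllowed(query: string, limit: number): void {
  const depth = approximateDepth(query);
  if (depth > limit) {
    throw new Error(`query depth ${depth} exceeds limit ${limit}`);
  }
}

const shallowQuery = "{ posts { title author { name } } }";
const deepQuery = "{ posts { author { posts { author { posts { title } } } } } }";

console.log(approximateDepth(shallowQuery)); // 3
console.log(approximateDepth(deepQuery));    // 6
```

The check needs only the query text, so it can run at validation time before any resolver touches the database.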

Insight: Schema-first keeps the API contract visible — the SDL file is the documentation. Code-first keeps schema and resolvers in sync automatically — TypeScript types are the truth. Both are valid; pick based on team preference.


🎯

Challenge

Compare N+1 (before) vs DataLoader (after). Check the query count badge — it should drop from 11 to 2.


🎯

Why Should I Care?

— The exact interview question + the bug it kills.

Interview questions

Q: How does DataLoader batch and cache?

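// DataLoader internals, simplified (the real library handles more cases)
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;

class DataLoader<K, V> {
  private queue: Array<{ key: K; resolve: (v: V) => void; reject: (e: unknown) => void }> = [];
  private cache = new Map<K, Promise<V>>();

  // batchFn must return values in the same order as the keys it receives
  constructor(private batchFn: BatchFn<K, V>) {}

  load(key: K): Promise<V> {
    // Cache: a repeated key returns the same promise, so it is fetched once
    const cached = this.cache.get(key);
    if (cached) return cached;

    const promise = new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
    });
    this.cache.set(key, promise);

    // Batch: the first key in an empty queue schedules one flush
    // for the end of the current tick (a microtask in this sketch)
    if (this.queue.length === 1) {
      Promise.resolve().then(() => this.flush());
    }

    return promise;
  }

  private async flush() {
    const batch = this.queue.splice(0); // drain queue
    const keys = batch.map(b => b.key);
    try {
      const values = await this.batchFn(keys); // ONE DB call
      batch.forEach((b, i) => b.resolve(values[i]));
    } catch (err) {
      // Do not cache failures: drop the keys so a later load can retry
      batch.forEach(b => { this.cache.delete(b.key); b.reject(err); });
    }
  }
}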

Q: Cursor-based vs offset-based pagination — when to use each?

Offset: Simple SQL (LIMIT x OFFSET y). Works for static datasets. Breaks when items are inserted or deleted mid-page (duplicates/gaps). Performance degrades at large offsets (O(n) scan). Good for: admin tables, search results, any paginated list where real-time updates don't matter.

Cursor: Encodes the last-seen position (usually base64(id)). Stable under real-time updates. O(log n) via index seek. Can't jump to page N. Good for: social feeds, real-time lists, infinite scroll, anything that updates while the user reads.
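The difference under concurrent deletes can be shown with a small in-memory sketch. The `offsetPage` and `cursorPage` helpers are hypothetical, and the cursor is a plain `post:<id>` string for brevity where a production API would typically base64-encode it. The filter stands in for `WHERE id > :afterId ORDER BY id LIMIT :first`.

```typescript
interface Post { id: number; title: string }

// Opaque cursor helpers (plain string here; usually base64 in practice).
const encodeCursor = (id: number): string => `post:${id}`;
const decodeCursor = (cursor: string): number => Number(cursor.slice("post:".length));

// Offset pagination: LIMIT :limit OFFSET :offset
function offsetPage(rows: Post[], limit: number, offset: number): Post[] {
  return rows.slice(offset, offset + limit);
}

// Cursor pagination: WHERE id > :afterId ORDER BY id LIMIT :first
function cursorPage(rows: Post[], first: number, after?: string) {
  const afterId = after === undefined ? -Infinity : decodeCursor(after);
  const remaining = rows.filter(p => p.id > afterId);
  const items = remaining.slice(0, first);
  const endCursor = items.length > 0 ? encodeCursor(items[items.length - 1].id) : null;
  return { items, endCursor, hasNextPage: remaining.length > first };
}

let feed: Post[] = [1, 2, 3, 4, 5].map(id => ({ id, title: `post${id}` }));

// The reader loads page 1 (posts 1 and 2), then post 1 is deleted.
const firstPage = cursorPage(feed, 2);
feed = feed.filter(p => p.id !== 1);

// Page 2 by offset skips post 3; page 2 by cursor does not.
const offsetIds = offsetPage(feed, 2, 2).map(p => p.id);
const cursorIds = cursorPage(feed, 2, firstPage.endCursor ?? undefined).items.map(p => p.id);

console.log(offsetIds); // [ 4, 5 ]
console.log(cursorIds); // [ 3, 4 ]
```

The offset version silently loses post 3 because every remaining row shifted up by one position, while the cursor version resumes from the last ID the reader actually saw.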

Q: Schema-first vs code-first GraphQL — tradeoffs?

Schema-first: SDL written in .graphql files. Resolvers generated or written separately. The SDL is the documentation and contract — visible to all teams. Risk: schema and resolvers can drift. Tools: graphql-code-generator generates TypeScript types from SDL.

Code-first: Types defined in TypeScript (Pothos, TypeGraphQL, NestJS GraphQL). SDL generated automatically. Resolver and type always in sync — TypeScript is the truth. Risk: SDL is generated output, less visible. Better for TypeScript-first teams.

Bug: not limiting query depth — DoS vector

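# ✗ DANGER: deeply nested query can fetch millions of rows
query {
  posts {
    author {
      posts {
        author {
          posts {            # unbounded nesting
            author { ... }   # DB load multiplies at every level
          }
        }
      }
    }
  }
}

// ✓ FIX: depth limiting plus complexity scoring (server setup, JavaScript)
// Apollo Server 4 import shown; older versions import from 'apollo-server'
import { ApolloServer } from '@apollo/server';
import depthLimit from 'graphql-depth-limit';
import { createComplexityRule, simpleEstimator } from 'graphql-query-complexity';

const server = new ApolloServer({
  typeDefs,  // schema and resolvers are defined elsewhere
  resolvers,
  validationRules: [
    depthLimit(5), // max nesting depth
    createComplexityRule({
      maximumComplexity: 1000,
      // the library requires at least one estimator to score fields
      estimators: [simpleEstimator({ defaultComplexity: 1 })],
    }),
  ],
});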

🔬

The Deep Dive

— Spec refs, engine internals, the minutiae.

Persisted queries for security and caching

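Persisted queries replace the full GraphQL query string with a hash, and clients send only the hash. There are two modes. With a registered allowlist (queries extracted at build time), unknown hashes are rejected, which prevents arbitrary queries from reaching production. With automatic persisted queries (the flow in the request example below), an unknown hash makes the client resend the full query once so the server can store it; this saves bandwidth but does not restrict which queries can run. In both modes, sending the hash over GET as a query param makes responses CDN-cacheable.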

// 1. Client sends the hash instead of the full query
GET /graphql?extensions={"persistedQuery":{"version":1,"sha256Hash":"abc123"}}

// 2. Hash unknown → server answers PersistedQueryNotFound,
//    and the client retries with the full query plus the hash
GET /graphql?query={posts{id,title}}&extensions={"persistedQuery":{"version":1,"sha256Hash":"abc123"}}

// 3. Server caches: hash → query string
// 4. Next request: just the hash, no query text needed
// CDNs can cache these GET responses (POST responses are generally not cached)
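For the stricter variant where unknown hashes are rejected outright, the server-side lookup might look like the sketch below. The `persistedQueries` map, the `handlePersistedRequest` helper, and the 400 status are assumptions for illustration, and `abc123` is the placeholder hash from the example above, not a real SHA-256 digest.

```typescript
// Hypothetical allowlist built at deploy time: hash -> query text.
const persistedQueries = new Map<string, string>([
  ["abc123", "{ posts { id title } }"],
]);

interface PersistedResponse { status: number; body: string }

// Look up the hash; anything not registered never reaches the executor.
function handlePersistedRequest(hash: string, allowlist: Map<string, string>): PersistedResponse {
  const query = allowlist.get(hash);
  if (query === undefined) {
    return { status: 400, body: "PersistedQueryNotFound" };
  }
  // A real server would execute the query here; the sketch only echoes it.
  return { status: 200, body: `would execute: ${query}` };
}

const known = handlePersistedRequest("abc123", persistedQueries);
const unknown = handlePersistedRequest("deadbeef", persistedQueries);

console.log(known.status, unknown.status); // 200 400
```

Because the allowlist is the only source of query text, a client cannot send an arbitrary query even with a valid-looking hash.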

Apollo vs Yoga vs Pothos

Apollo Server

Most widely used. Plugin ecosystem, Federation for microservices. Heavy — 15MB bundle. Overkill for simple APIs.

Use for: Enterprise, microservice federation

GraphQL Yoga

Lightweight (~2MB), built on Fetch API, works in Edge runtime (Cloudflare Workers). Code-first optional. Great default.

Use for: Modern stacks, Edge runtime

Pothos (schema builder)

Code-first schema builder in TypeScript. Plugin for Prisma integration. Best TypeScript type safety. No SDL files.

Use for: TypeScript-first, Prisma users

GraphQL caching challenges

GraphQL's biggest caching weakness: queries are typically sent as POST requests, which HTTP caches and CDNs do not cache by default. Solutions: (1) persisted queries via GET; (2) a client-side normalized cache (Apollo Client, urql); (3) partial cache invalidation via cache tags.

// Apollo Client: normalized in-memory cache
// Objects are cached by __typename + id
import { ApolloClient, InMemoryCache, gql } from '@apollo/client';

const client = new ApolloClient({
  uri: '/graphql', // assumed endpoint
  cache: new InMemoryCache(),
});

// A query fetches Post:1 and User:5
// Cache stores { 'Post:1': {...}, 'User:5': {...} }
// If a mutation response returns User:5 (id plus the changed fields),
// the cache merges it and every active query that reads User:5 re-renders

// Manual cache update after a mutation:
const GET_POST = gql`
  query GetPost($id: ID!) {
    post(id: $id) { id title }
  }
`;

client.writeQuery({
  query: GET_POST,
  variables: { id: '1' },
  // updatedPost: the Post returned by your mutation (with __typename and id)
  data: { post: updatedPost },
});
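The normalization idea itself fits in a few lines. The sketch below is not Apollo Client's implementation: `NormalizedStore` is a hypothetical class that only shows how keying records by `__typename:id` lets one write update every place an entity is referenced.

```typescript
// A stored record: identity fields plus arbitrary data fields.
type StoredObject = { __typename: string; id: string; [field: string]: unknown };

class NormalizedStore {
  private records = new Map<string, StoredObject>();

  // Merge incoming fields into any existing record with the same key.
  write(obj: StoredObject): void {
    const key = `${obj.__typename}:${obj.id}`;
    const existing = this.records.get(key);
    this.records.set(key, { ...existing, ...obj });
  }

  read(typename: string, id: string): StoredObject | undefined {
    return this.records.get(`${typename}:${id}`);
  }
}

const store = new NormalizedStore();

// A query result is split into flat records; the post points at its author by key.
store.write({ __typename: "User", id: "5", name: "Ada", email: "ada@example.com" });
store.write({ __typename: "Post", id: "1", title: "Hello", author: { __ref: "User:5" } });

// A mutation response returns User:5 with only the changed field.
store.write({ __typename: "User", id: "5", name: "Ada L." });

const cachedUser = store.read("User", "5");
console.log(cachedUser);
// { __typename: 'User', id: '5', name: 'Ada L.', email: 'ada@example.com' }
```

Post:1 never had to be touched: it holds a reference to User:5, so any view that follows that reference sees the new name.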

🎤

Interview Questions

— Real questions from real interviews — with answers.

Cursor pagination for the feed; subscriptions for new posts; DataLoader for all nested resolvers.

Resolver fires per field per parent; DataLoader batches all calls within one event loop tick.

Apply depth limits and complexity scoring; reject queries above thresholds at validation time.

Subscriptions for collaborative/multi-user events; SSE for one-way feeds; polling for infrequent status.

Apollo Federation: each team owns a subgraph; the gateway composes them into one unified schema.

Schema-first: SDL is the visible contract; code-first: TypeScript is the source of truth with no drift.

🎮

Memory Game

— Quick quiz — lock the concept in long-term memory.


What is the Apollo Router used for in a federation architecture?