Streaming SSR: Progressive HTML Streaming
Learn the interview-ready mental model, practical trade-offs, and production patterns for this web fundamentals topic.
Streaming SSR allows the server to send HTML progressively as different parts of the page become ready, instead of waiting for the entire render to complete. Built around React Suspense boundaries, it dramatically improves perceived performance by showing useful content (shell + early sections) before slow dependencies finish.
Traditional SSR = kitchen waits until every dish is ready before bringing anything out (slowest item blocks everything). Streaming SSR = server sends the appetizers and drinks immediately, then brings main courses as they finish cooking. The user starts enjoying the meal much sooner even if the full dinner takes the same total time.
1. How Streaming SSR Works
The server renders the page shell immediately and streams it. When it hits a Suspense boundary whose data isn't ready, it sends the fallback and continues rendering the rest of the page. Once the suspended data resolves, the completed chunk streams in and replaces the fallback in place.
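The flush order can be sketched without React at all. A minimal simulation, assuming nothing about React's internals (the `B:0`/`S:0` IDs only mimic the placeholder markers React emits and are illustrative):

```typescript
// Sketch of the streaming sequence: shell and fallback are flushed
// immediately; the completed chunk follows once the suspended data resolves.
async function streamPage(slowData: Promise<string>): Promise<string[]> {
  const flushed: string[] = [];
  flushed.push('<div id="shell">Header + nav</div>');        // sent right away
  flushed.push('<div id="B:0">Loading products…</div>');     // Suspense fallback
  const products = await slowData;                           // server keeps rendering
  flushed.push(`<template id="S:0">${products}</template>`); // replaces the fallback
  return flushed;
}

// The shell reaches the browser long before the slow data resolves.
streamPage(
  new Promise<string>((resolve) =>
    setTimeout(() => resolve('<ul><li>Product A</li></ul>'), 50)
  )
).then((chunks) => console.log(chunks.length)); // prints 3
```

In real React, the swap from fallback to content is done by a small inline script that the stream carries along with each completed chunk.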
<Suspense fallback={<ProductSkeleton />}>
<ProductList />
</Suspense>

2. Suspense Boundaries & Parallelism
Suspense acts as both loading UI and streaming boundary. Independent boundaries allow parallel data fetching and progressive delivery. Good boundaries separate critical shell from secondary slow content.
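The parallelism can be simulated with two independent delays: each boundary completes on its own schedule, so the faster one streams in first even if it appears later in the markup. A sketch with illustrative names (`reviews`/`recommendations` are not from the original):

```typescript
// Two independent Suspense boundaries fetch in parallel; neither blocks
// the other, and the completion order follows latency, not markup order.
async function streamBoundaries(): Promise<string[]> {
  const completionOrder: string[] = [];
  const reviews = new Promise<void>((r) => setTimeout(r, 30))
    .then(() => { completionOrder.push('reviews'); });          // slower boundary
  const recommendations = new Promise<void>((r) => setTimeout(r, 10))
    .then(() => { completionOrder.push('recommendations'); });  // faster boundary
  await Promise.all([reviews, recommendations]); // both run concurrently
  return completionOrder;
}

streamBoundaries().then((order) => console.log(order)); // faster boundary first
```

This is why boundary placement matters: one boundary around everything serializes the page back into "slowest item blocks everything", while independent boundaries let each section arrive as soon as it can.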
3. React Server Components + Streaming
In Next.js App Router, Server Components naturally support streaming. Combine with Suspense for optimal results. Edge runtime further reduces latency by running closer to users.
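In the App Router this typically looks like an async Server Component wrapped in Suspense. A hedged sketch, not runnable outside a Next.js project; the component names and the API endpoint are illustrative only:

```typescript
// app/product/page.tsx — illustrative Next.js App Router sketch.
import { Suspense } from 'react';

// An async Server Component: it suspends while its data loads,
// then streams into the page once the promise resolves.
async function Reviews() {
  const reviews: { id: string; text: string }[] = await fetch(
    'https://api.example.com/reviews' // hypothetical endpoint
  ).then((res) => res.json());
  return (
    <ul>
      {reviews.map((r) => <li key={r.id}>{r.text}</li>)}
    </ul>
  );
}

export default function ProductPage() {
  return (
    <main>
      <h1>Product</h1> {/* shell: streams immediately */}
      <Suspense fallback={<p>Loading reviews…</p>}>
        <Reviews /> {/* streams in when the fetch resolves */}
      </Suspense>
    </main>
  );
}
```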
4. Traditional SSR vs Streaming SSR
Traditional waits for everything → one big response. Streaming delivers early shell + progressive chunks. Hydration and JS cost remain similar in both cases.
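The difference shows up in time-to-first-byte, not total time. A minimal timing sketch under that assumption (the chunk contents are placeholders):

```typescript
// Compare when the first bytes leave the server: traditional SSR holds the
// whole response until the slowest part finishes; streaming flushes the
// shell immediately and sends the slow part later.
type Chunk = { html: string; at: number }; // `at` = ms since request start

async function traditionalSSR(slowMs: number): Promise<Chunk[]> {
  const start = Date.now();
  await new Promise((r) => setTimeout(r, slowMs)); // wait for everything
  return [{ html: '<shell + content>', at: Date.now() - start }]; // one big response
}

async function streamingSSR(slowMs: number): Promise<Chunk[]> {
  const start = Date.now();
  const chunks: Chunk[] = [{ html: '<shell>', at: Date.now() - start }]; // flushed at ~0ms
  await new Promise((r) => setTimeout(r, slowMs));
  chunks.push({ html: '<content>', at: Date.now() - start }); // arrives later
  return chunks;
}
```

Both finish at roughly the same time; only streaming gives the user something to look at in the meantime.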
- ✓ Streaming SSR delivers HTML progressively instead of all at once
- ✓ Suspense boundaries define natural streaming points
- ✓ Shell + critical content should render outside slow boundaries
- ✓ Greatly improves perceived performance on pages with mixed latency
- ✓ Works best with Server Components and Edge runtimes
- ✓ Hydration cost remains — streaming helps HTML arrival, not interactivity
- ✓ Always measure real user experience, not just backend render time