Server-Sent Events in Go: A Production Deep Dive

Kodetra Technologies · April 27, 2026 · 12 min read · Advanced

Summary

Build a robust SSE service in Go with backpressure, reconnects, fan-out, and graceful shutdown.

Why Server-Sent Events Are Back in 2026

Server-Sent Events were the awkward middle child of real-time web tech for a decade. WebSockets got the spotlight, and SSE quietly waited. In 2026, with the htmx renaissance, AI token streaming everywhere, and HTTP/2 multiplexing finally everywhere in production, SSE is suddenly the right tool for an enormous class of problems: server-to-client push that does not need full-duplex.

The promise is simple. One unidirectional HTTP connection. Plain text frames separated by blank lines. Automatic reconnect baked into the browser. Works through every proxy you already trust. The catch is that the toy examples all over the internet skip the parts that matter in production: backpressure when a client falls behind, clean reconnection with Last-Event-ID, fan-out across thousands of subscribers, graceful shutdown, and horizontal scaling.

This guide builds an SSE service in Go from a 30-line handler up to a multi-instance Redis-backed broker. By the end you will know how to ship something you can put in front of real users without paging yourself at 3 a.m.

Prerequisites

  • Go 1.22 or newer (we will use the new http.ServeMux patterns)
  • Comfort with goroutines, channels, and context.Context
  • A reverse proxy or load balancer that does not buffer responses (Nginx, Caddy, Cloudflare, Traefik)
  • Optional: a local Redis 7 instance for the horizontal-scale section

Step 1: A Minimal SSE Handler That Will Bite You

The smallest correct SSE handler in Go is about 30 lines. Most tutorials stop here. We will start here, then fix the real problems one by one.

package main

import (
    "fmt"
    "net/http"
    "time"
)

func sseHandler(w http.ResponseWriter, r *http.Request) {
    // Tell intermediaries this is an event stream.
    w.Header().Set("Content-Type", "text/event-stream")
    w.Header().Set("Cache-Control", "no-cache")
    w.Header().Set("Connection", "keep-alive")

    flusher, ok := w.(http.Flusher)
    if !ok {
        http.Error(w, "streaming unsupported", http.StatusInternalServerError)
        return
    }

    ticker := time.NewTicker(1 * time.Second)
    defer ticker.Stop()

    for {
        select {
        case <-r.Context().Done():
            return
        case t := <-ticker.C:
            fmt.Fprintf(w, "data: %s\n\n", t.Format(time.RFC3339))
            flusher.Flush()
        }
    }
}

func main() {
    mux := http.NewServeMux()
    mux.HandleFunc("GET /events", sseHandler)
    http.ListenAndServe(":8080", mux) // error intentionally ignored in this toy version
}

Open it in your browser with a tiny client and you will see ticks every second:

<script>
  const es = new EventSource("/events");
  es.onmessage = e => console.log("tick:", e.data);
</script>

This works. It is also a trap. Three problems lurk inside it:

  • The handler keeps running on a stale connection if a proxy holds the request open.
  • There is no id: field, so the browser cannot tell the server where to resume after a drop.
  • Every connected client gets its own goroutine producing its own data, so 10,000 subscribers means 10,000 ticker goroutines doing identical work.


Step 2: Heartbeats and Last-Event-ID for Honest Reconnects

EventSource reconnects automatically when the connection drops. By default it waits about three seconds, then issues a new GET. If the server included an id: field on previous events, the browser sends the most recent one back as the Last-Event-ID request header. Your job is to read it and resume.

Two things make reconnects work in the wild. First, send a periodic comment line as a heartbeat so idle connections do not get killed by intermediaries (the typical idle timeout is 60 to 120 seconds). Second, persist enough recent history to replay events from Last-Event-ID forward.

type Event struct {
    ID   uint64
    Name string // optional event type
    Data string
}

func writeEvent(w http.ResponseWriter, f http.Flusher, ev Event) error {
    if ev.Name != "" {
        if _, err := fmt.Fprintf(w, "event: %s\n", ev.Name); err != nil {
            return err
        }
    }
    if _, err := fmt.Fprintf(w, "id: %d\n", ev.ID); err != nil {
        return err
    }
    // Split multi-line data into multiple data: lines per the spec.
    for _, line := range strings.Split(ev.Data, "\n") {
        if _, err := fmt.Fprintf(w, "data: %s\n", line); err != nil {
            return err
        }
    }
    if _, err := fmt.Fprint(w, "\n"); err != nil {
        return err
    }
    f.Flush()
    return nil
}

func writeHeartbeat(w http.ResponseWriter, f http.Flusher) error {
    // A comment line. Browsers ignore it but it keeps the socket warm.
    if _, err := fmt.Fprint(w, ": ping\n\n"); err != nil {
        return err
    }
    f.Flush()
    return nil
}

Now wire those into the handler with a 20-second heartbeat and a Last-Event-ID resume hook:

func (s *Server) sseHandler(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "text/event-stream")
    w.Header().Set("Cache-Control", "no-cache")
    w.Header().Set("X-Accel-Buffering", "no") // tell Nginx not to buffer

    f, ok := w.(http.Flusher)
    if !ok {
        http.Error(w, "streaming unsupported", http.StatusInternalServerError)
        return
    }

    var resumeFrom uint64
    if h := r.Header.Get("Last-Event-ID"); h != "" {
        if v, err := strconv.ParseUint(h, 10, 64); err == nil {
            resumeFrom = v
        }
    }

    // Replay missed events first.
    if missed := s.history.Since(resumeFrom); len(missed) > 0 {
        for _, ev := range missed {
            if err := writeEvent(w, f, ev); err != nil { return }
        }
    }

    sub := s.hub.Subscribe()
    defer s.hub.Unsubscribe(sub)

    heartbeat := time.NewTicker(20 * time.Second)
    defer heartbeat.Stop()

    for {
        select {
        case <-r.Context().Done():
            return
        case <-heartbeat.C:
            if err := writeHeartbeat(w, f); err != nil { return }
        case ev, ok := <-sub.C:
            if !ok { return }
            if err := writeEvent(w, f, ev); err != nil { return }
        }
    }
}

The X-Accel-Buffering: no header is non-negotiable behind Nginx. Without it, Nginx buffers the response and your stream becomes a pile of 4 KB chunks that arrive seconds late. Cloudflare and modern Caddy obey the spec without coaxing, but the header is harmless either way.


Step 3: A Hub for Fan-Out Without 10,000 Tickers

The naive handler made every subscriber compute its own data. The fix is the classic broker pattern: one producer, many subscribers. Producers push events into the hub once. The hub fans them out to every active subscriber.

type Subscriber struct {
    C chan Event
}

type Hub struct {
    mu   sync.RWMutex
    subs map[*Subscriber]struct{}
}

func NewHub() *Hub {
    return &Hub{subs: make(map[*Subscriber]struct{})}
}

func (h *Hub) Subscribe() *Subscriber {
    s := &Subscriber{C: make(chan Event, 64)} // bounded buffer per client
    h.mu.Lock()
    h.subs[s] = struct{}{}
    h.mu.Unlock()
    return s
}

func (h *Hub) Unsubscribe(s *Subscriber) {
    h.mu.Lock()
    if _, ok := h.subs[s]; ok {
        delete(h.subs, s)
        close(s.C)
    }
    h.mu.Unlock()
}

func (h *Hub) Publish(ev Event) {
    h.mu.RLock()
    defer h.mu.RUnlock()
    for s := range h.subs {
        select {
        case s.C <- ev:
            // delivered
        default:
            // subscriber is slow; drop or disconnect (see Step 4)
        }
    }
}

Two design choices here are load-bearing. The per-subscriber channel is bounded. We picked 64 because it is roughly one second of events at 60 Hz, which is more than any human-facing UI needs. Bounded channels turn slow clients into a local problem instead of a heap explosion. And the Publish path uses a non-blocking send. A slow subscriber will never stall the producer or block any other subscriber.


Step 4: Backpressure That Will Not Take Down Your Server

In the broker above, what happens when a subscriber is slow? The non-blocking send drops the event silently. That is fine for a stock ticker where the next tick supersedes the last one. It is wrong for a chat app or anything order-sensitive. You need a policy.

There are three sane policies for SSE backpressure, and the right answer depends on your domain.

  1. Drop oldest. Replace the oldest queued event with the new one. Good for tickers, telemetry, presence. Cheap and stays bounded.
  2. Disconnect slow clients. If the buffer fills, kick the subscriber. Their browser will reconnect with Last-Event-ID and replay from history. Good when you have an event log to resume from.
  3. Block the producer. Almost always wrong for a fan-out hub. One slow client must not pause the rest. Only acceptable for single-subscriber streams.

Here is a hub with the disconnect-slow-clients policy. It is the policy you want for AI token streaming or chat: order matters, and the client can resume.

func (h *Hub) Publish(ev Event) {
    h.mu.RLock()
    var slow []*Subscriber
    for s := range h.subs {
        select {
        case s.C <- ev:
        default:
            slow = append(slow, s)
        }
    }
    h.mu.RUnlock()

    // Disconnect slow clients outside the read lock.
    for _, s := range slow {
        h.Unsubscribe(s)
    }
}

Now the dropped subscriber's handler sees sub.C close and returns. The browser reconnects, the handler reads Last-Event-ID, the history buffer replays missed events, and the user sees a clean recovery instead of silently missing data.

A Ring Buffer for History

Replays need an event log. You do not need a full database; an in-memory ring buffer covers the common case where reconnects happen within seconds, and that is 99 percent of disconnects.

type History struct {
    mu  sync.Mutex
    buf []Event
    cap int
}

func NewHistory(cap int) *History {
    return &History{cap: cap}
}

func (h *History) Append(ev Event) {
    h.mu.Lock()
    defer h.mu.Unlock()
    h.buf = append(h.buf, ev)
    if len(h.buf) > h.cap {
        h.buf = h.buf[len(h.buf)-h.cap:]
    }
}

func (h *History) Since(id uint64) []Event {
    h.mu.Lock()
    defer h.mu.Unlock()
    out := make([]Event, 0)
    for _, ev := range h.buf {
        if ev.ID > id {
            out = append(out, ev)
        }
    }
    return out
}

A 1,000-event ring buffer at 64 bytes per event is 64 KB. You can run many independent topics at this scale on a small box. If your event volume is higher, partition by topic key and keep one ring per topic.


Step 5: Graceful Shutdown Without Cutting Off Mid-Event

SSE connections are long-lived. A naive os.Exit rips them mid-frame. A naive http.Server.Shutdown waits for them to drain, which never happens by definition. The right pattern is to close the hub first, which signals every subscriber, then call Shutdown with a short timeout to drain the in-flight writes.

func (s *Server) Run(ctx context.Context) error {
    srv := &http.Server{Addr: ":8080", Handler: s.routes()}

    go func() {
        <-ctx.Done()
        // Tell every subscriber the party is over.
        s.hub.CloseAll()

        shutCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
        defer cancel()
        _ = srv.Shutdown(shutCtx)
    }()

    err := srv.ListenAndServe()
    if errors.Is(err, http.ErrServerClosed) {
        return nil // graceful shutdown, not a failure
    }
    return err
}

func (h *Hub) CloseAll() {
    h.mu.Lock()
    defer h.mu.Unlock()
    for s := range h.subs {
        close(s.C)
    }
    h.subs = nil // Subscribe after this would panic; Shutdown stops new requests first
}

Now SIGTERM produces a clean exit. Each handler exits its select on the closed channel, sends nothing more, and the HTTP server drains the writes within five seconds. Your load balancer sees graceful disconnects and clients reconnect to the new instance.


Step 6: Scaling Horizontally with Redis Pub/Sub

One process can comfortably hold 10,000 to 50,000 SSE connections on modern hardware (Linux ulimit permitting). Past that, or for high availability, you need multiple instances behind a load balancer. The challenge is that an event published on instance A must reach subscribers on instance B and C.

Redis Pub/Sub is the cheapest answer that works. NATS, Kafka, and Postgres LISTEN/NOTIFY all fit too; pick what your team already runs. The pattern is identical: each instance has a local hub, plus a Redis subscriber that re-publishes received messages into the local hub.

type DistributedHub struct {
    *Hub
    rdb     *redis.Client
    channel string
}

func (d *DistributedHub) Run(ctx context.Context) error {
    sub := d.rdb.Subscribe(ctx, d.channel)
    defer sub.Close()

    ch := sub.Channel()
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        case msg, ok := <-ch:
            if !ok {
                return nil // subscription closed
            }
            var ev Event
            if err := json.Unmarshal([]byte(msg.Payload), &ev); err != nil {
                continue
            }
            d.Hub.Publish(ev) // fan out to local subscribers only
        }
    }
}

func (d *DistributedHub) Publish(ctx context.Context, ev Event) error {
    payload, err := json.Marshal(ev)
    if err != nil { return err }
    // Publish only to Redis. The local Run loop will pick it up
    // and fan out, including to this instance's own subscribers.
    return d.rdb.Publish(ctx, d.channel, payload).Err()
}

A subtle point: producers do not call the local hub directly. They publish to Redis, and the Redis subscription loop fans out to local subscribers. This guarantees that every instance, including the one that originated the event, sees events in the same order as everyone else. Without this discipline you get inconsistent ordering between instances and very confused users.

For at-least-once semantics you need Redis Streams instead of Pub/Sub, plus per-subscriber consumer groups. The structure is the same; the implementation is one extra page of code and a different set of trade-offs around lag and memory. Start with Pub/Sub; graduate to Streams when you have a reason.


Gotchas the Toy Examples Skip

1. The 6-connection per-origin limit in HTTP/1.1

Browsers cap parallel HTTP/1.1 connections per origin at six. If you serve assets and the SSE stream from the same hostname over HTTP/1.1, every open tab steals one slot, and a user with seven tabs sees the last one hang. Fix: serve over HTTP/2 (which multiplexes), or put SSE on a dedicated subdomain.

2. Compression breaks streaming

Most reverse proxies enable gzip by default. Gzip needs to buffer to compress effectively, which defeats the entire point of SSE. Disable compression on the SSE route specifically. In Nginx: gzip off; inside the location block. In Caddy: omit encode for the SSE matcher.
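For Nginx, the route-level settings from this gotcha and the buffering one combine into a location block like the following (a sketch; the upstream name app is a placeholder for your own):

```nginx
# Illustrative SSE route configuration; "app" is a hypothetical upstream.
location /events {
    proxy_pass http://app;
    proxy_http_version 1.1;     # long-lived upstream connections
    proxy_buffering off;        # same effect as X-Accel-Buffering: no
    gzip off;                   # buffering for compression defeats streaming
    proxy_read_timeout 1h;      # heartbeats keep the stream alive within this
    proxy_set_header Connection "";
}
```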

3. Browser tabs in the background

Modern browsers throttle background tabs aggressively. Your heartbeat will still arrive, but rendering may pause. Do not rely on client-side timeouts to detect a dead connection; rely on a missing heartbeat on the server side, or build a visibilitychange handler that reconnects on focus.

4. CORS preflight on EventSource

EventSource does not send custom headers, but it does send Origin. If your API is on a different origin, you need Access-Control-Allow-Origin on the response. EventSource cannot send Cookie across origins unless you set withCredentials: true on the constructor and respond with Access-Control-Allow-Credentials: true. Most teams put SSE on the same origin to dodge this entirely.

5. Forgetting to call Flush()

Without Flusher.Flush(), the net/http response writer buffers up to 4 KB before flushing. Your single 30-byte event sits in a buffer until the response closes. The browser sees nothing. Always flush after every event and every heartbeat.

6. Mobile networks and aggressive NAT

Carrier-grade NAT often kills idle connections after 60 seconds. A 20-second heartbeat survives this. A 90-second heartbeat does not. If you support mobile, keep the heartbeat under 30 seconds.


Quick Reference

  • Heartbeat interval: 15 to 25 seconds, sent as a ": ping" comment
  • Per-subscriber buffer: 32 to 128 events; bounded, never unlimited
  • Backpressure policy: disconnect slow clients; rely on Last-Event-ID replay
  • History size: 1,000 events per topic for typical workloads
  • Reverse proxy: set X-Accel-Buffering: no; disable gzip on the SSE route
  • Shutdown: close the hub first, then http.Server.Shutdown with a 5-second timeout
  • Horizontal scale: Redis Pub/Sub for at-most-once; Streams for at-least-once
  • Browser limits: serve over HTTP/2, or use a dedicated SSE subdomain
  • Connection cap per box: 10k to 50k on Linux with raised ulimit -n

Next Steps

You have everything you need to ship. To take this further, three directions are worth investigating depending on what you are building. For AI token streaming, swap the broker for a per-request channel: each chat request gets its own stream, no fan-out, no history. For massively multiplexed dashboards (think a per-user feed), shard by user ID across hubs and route by sticky session. For at-least-once delivery and audit trails, replace the in-memory history with a Postgres table partitioned by topic and use a dedicated worker to prune.

And once you have it running, instrument it. The metrics that actually catch incidents are: connected subscriber count per instance, events published per second, events dropped per second (your slow-client signal), and heartbeat write latency p99. Page on dropped events climbing or heartbeat latency climbing; ignore the rest.

SSE will not replace WebSockets, and it should not. But for the 80 percent of real-time use cases where the server does the talking, it is simpler, cheaper, and more honest about its failure modes than the alternatives. Now you know how to make it boring in production.
