
Turso Embedded Replicas: Local-First SQLite Done Right
Summary
Microsecond reads via embedded SQLite synced from Turso Cloud: setup, gotchas, patterns
Your Postgres sits behind a network. Every SELECT hops a router, traverses TLS, waits for the planner, and finally trickles bytes back. Most apps spend more milliseconds on round trips than on real work. In 2026 a different model has gone mainstream: keep the database inside your application process, and let a managed cloud handle durability, replication, and multi-region distribution behind the scenes.
That is the promise of Turso embedded replicas. You point your app at a local file, point that local file at a Turso Cloud database, and reads now finish in microseconds while writes still durably hit the primary. This guide walks through how embedded replicas work, when to use them, how to wire them into a Node.js or TypeScript service, and the gotchas that bite people in production.
What you will learn
- How embedded replicas differ from connection pools, edge caches, and traditional read replicas
- How to set up a libSQL client with sync intervals, manual sync, and read-your-writes
- Patterns for deploys, ephemeral containers, and serverless functions
- Consistency trade-offs and how to reason about staleness
- Production gotchas: sync storms, disk usage, schema migrations, and observability
Prerequisites
- Node.js 20 or newer (the examples use TypeScript, but plain JS works too)
- A free Turso account and the Turso CLI installed
- Comfort with basic SQL and SQLite's constraints (single-writer, file-based)
- About 30 minutes
What an embedded replica actually is
A traditional read replica is a separate database server somewhere on your network. It catches writes from a primary, exposes its own connection endpoint, and you treat it like any other database. An embedded replica is the same idea pushed further: the replica is a SQLite file inside your application process. Your app opens it like a local SQLite database, but a background sync task pulls fresh frames from the primary on Turso Cloud.
Three properties fall out of that arrangement:
- Reads are local file I/O. No network, no TLS, no planner round trip. Sub-millisecond latency is the floor, not the ceiling.
- Writes still go to the primary. The client transparently forwards INSERT, UPDATE, and DELETE to Turso Cloud and waits for durable acknowledgement.
- Sync is incremental. The replica pulls only WAL frames it does not have, so a long-running container does not re-download the whole database on every refresh.
If you have ever thought, "I wish my Postgres ran on the same machine as my web server," this is the shape of that wish, but with a managed primary that handles backups, point-in-time restore, and multi-region replication for you.
When to use embedded replicas
Embedded replicas shine in three workloads:
- Read-heavy services with predictable schemas. Catalog browsing, feature flag lookups, content sites, dashboards. Anywhere reads outnumber writes 50 to 1, latency drops are immediate and large.
- Multi-region apps that hate global routing. Each instance keeps its own replica, so a Singapore container reads from a Singapore SQLite file. There is no routing logic to write.
- Local-first products and AI agents. Each agent or each user can have a private database that syncs back to the cloud, which is the data pattern many AI infrastructure teams have converged on this year.
They are a poor fit when:
- Your workload is overwhelmingly write heavy. SQLite is single-writer at the file level, and even though Turso forwards writes to the primary, your application gains nothing on the write path.
- You need cross-row transactional reads to be perfectly fresh. Embedded replicas are eventually consistent by default. You can force a sync before a critical read, but if you need that on every request, you have given back most of the latency win.
- You need rich Postgres features like JSONB operators, full SQL window functions across complex queries, or extensions like PostGIS. SQLite has come a long way, but it is still SQLite.
How sync actually works
Under the hood, libSQL extends SQLite's WAL with a frame protocol. The client tracks the highest frame index it has applied locally. When it asks the server to sync, it sends that index and receives any frames that follow. The frames are applied to the local WAL atomically, so a reader either sees the old snapshot or the new snapshot, never a torn write.
Three knobs control sync behavior:
- syncUrl and authToken: where the primary lives and how to authenticate. Without these, you have a normal local SQLite database with no replication.
- syncInterval: how often a background task pulls new frames. Common values are 30 to 120 seconds. Set it too low and you generate noisy traffic; set it too high and you read stale data.
- client.sync(): a manual call that forces an immediate pull. Use it before reads that must reflect a recent write you made yourself.
Step 1: Create a Turso database
Install the Turso CLI, log in, and create a database:
# install (macOS / Linux)
curl -sSfL https://get.tur.so/install.sh | bash
# log in
turso auth login
# create a database in the closest region
turso db create cb-replicas-demo
# get the libSQL URL and a long-lived token
turso db show --url cb-replicas-demo
turso db tokens create cb-replicas-demo --expiration none
Save the URL and token. The URL looks like libsql://cb-replicas-demo-yourname.turso.io and the token is a long JWT. Put both in environment variables, never in source control.
Step 2: Wire up the libSQL client
Create a Node.js project and install the official client:
npm init -y
npm install @libsql/client dotenv
npm install -D typescript tsx @types/node
npx tsc --init
Create src/db.ts:
import "dotenv/config";
import { createClient, Client } from "@libsql/client";
const SYNC_INTERVAL_S = Number(process.env.TURSO_SYNC_INTERVAL ?? 60);
export const db: Client = createClient({
url: "file:./data/local.db", // local replica file
syncUrl: process.env.TURSO_DATABASE_URL!, // primary on Turso Cloud
authToken: process.env.TURSO_AUTH_TOKEN!,
syncInterval: SYNC_INTERVAL_S * 1000,
});
// Pull the latest frames once on boot so cold starts read fresh data.
await db.sync();
console.log("local replica synced");
That single configuration object is the whole "edge replica" story. Pointing url at a file: path tells libSQL to use the embedded mode, and syncUrl turns it into a replica.
Step 3: Read locally, write through
Create a tiny example schema and exercise it:
// src/index.ts
import { db } from "./db.js";
await db.execute(`
CREATE TABLE IF NOT EXISTS notes (
id INTEGER PRIMARY KEY AUTOINCREMENT,
body TEXT NOT NULL,
created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
`);
// write — forwarded to primary
await db.execute({
sql: "INSERT INTO notes (body) VALUES (?)",
args: ["hello from the embedded replica"],
});
// force a pull so we read the row we just wrote
await db.sync();
// read — local file I/O, microseconds
const result = await db.execute("SELECT id, body FROM notes ORDER BY id DESC LIMIT 5");
for (const row of result.rows) {
console.log(row.id, row.body);
}
Run it twice. Each run forwards a write, syncs, and reads, but the second run reads from a replica file that already exists and returns both rows at local SQLite speed. Compare that against the same workload talking to a remote Postgres and you will feel the difference in your terminal.
Example output
$ tsx src/index.ts
local replica synced
1 hello from the embedded replica
$ tsx src/index.ts
local replica synced
2 hello from the embedded replica
1 hello from the embedded replica
Step 4: Read-your-writes without staleness bugs
The number-one footgun with eventual consistency is reading a row you just wrote and not finding it. With embedded replicas you have three good patterns to handle that:
- Force sync after a write when the next read in the same request flow depends on it. await db.sync() is cheap when there are no new frames.
- Use the write's return value instead of re-querying. execute returns lastInsertRowid and rowsAffected, and you can add RETURNING * to get the full row from the primary in a single round trip.
- Pin reads after a write to the primary using the ?direct=true URL option for that one query, then drop back to the replica.
A clean wrapper looks like this:
// helper that guarantees you read your own writes
export async function createNote(body: string) {
const res = await db.execute({
sql: "INSERT INTO notes (body) VALUES (?) RETURNING id, body, created_at",
args: [body],
});
return res.rows[0]; // includes the new id without a follow-up SELECT
}
export async function listNotesFresh() {
await db.sync();
return db.execute("SELECT id, body FROM notes ORDER BY id DESC LIMIT 50");
}
export async function listNotesFast() {
// cheap, may be a few seconds stale
return db.execute("SELECT id, body FROM notes ORDER BY id DESC LIMIT 50");
}
Make the staleness explicit in your function names. The teams who get hurt by embedded replicas are the ones where every developer assumes every read is fresh, then a bug appears once a quarter that nobody can explain.
Architecture: where the replica file lives
The replica is a real file. That sentence has consequences. You need to think about three deploy targets differently.
Long-lived VMs and bare metal
This is the simplest case. Mount a persistent volume, point url at it, and let the database grow with your application. After the first sync, restarts and crashes only need to pull the frames written while the process was offline.
Containers in Kubernetes or Fly.io
Pods come and go. If you store the replica on the pod's ephemeral disk, every restart re-syncs from frame zero, which is fine for small databases but painful at gigabyte scale. Use a small persistent volume per pod, or accept the cold start cost and keep the database compact.
On Fly.io, attach a Fly Volume. On Kubernetes, a StatefulSet with a PersistentVolumeClaim works. On AWS ECS you can mount EFS, but watch the latency, since EFS can be slower than the network round trip you were trying to avoid.
Serverless functions
Lambda, Vercel, and Cloudflare Workers each handle this differently. Lambda has read-only function code but a writable /tmp with limited size, and the file does not survive between invocations on cold starts. Vercel functions have similar constraints. Workers do not have a filesystem at all.
In serverless, embedded replicas with full sync are usually the wrong choice. Use Turso's normal HTTP client (no url: file:) so each cold start opens a fresh remote connection, or use Cloudflare D1 if you are already on Workers. Save embedded replicas for environments where the file persists for the lifetime of the process.
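As a point of contrast, a remote-only setup is a small sketch like the one below. It assumes the same @libsql/client package and TURSO_* environment variables used earlier; with no file: URL and no syncUrl, every query travels over the network to the primary, so there is nothing on local disk to lose between cold starts.
// src/serverless-db.ts — remote-only client for functions without a persistent filesystem
import "dotenv/config";
import { createClient } from "@libsql/client";

export const db = createClient({
  // note: no "file:" URL and no syncUrl — nothing is stored locally
  url: process.env.TURSO_DATABASE_URL!,     // the libsql:// URL from `turso db show`
  authToken: process.env.TURSO_AUTH_TOKEN!,
});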
Architecture: scaling up the read path
The point of an embedded replica is that scaling reads is the same problem as scaling your web tier. Add an instance, add a replica file. There is no shared bottleneck for reads, no read replica connection pool to size, no load balancer to configure for read routing.
Writes are different. Every instance forwards writes to the same primary, so the primary is the throughput ceiling. If your app pushes more than a few thousand writes per second sustained, you need to think about partitioning or move the write-heavy parts to a different store. Most CRUD apps will never get within 10x of that ceiling.
Trade-offs at a glance
| Concern | Embedded replica | Postgres read replica | Edge cache (CDN/KV) |
|---|---|---|---|
| Read latency | Microseconds (local file) | 1 to 10 ms (LAN) | Microseconds (in region) |
| Write latency | Round trip to primary | Round trip to primary | Writes bypass cache; invalidation needed |
| Consistency | Eventually consistent, manual sync | Eventually consistent, replication lag | Best-effort, TTL based |
| Schema flexibility | SQLite SQL | Full Postgres | Key-value only |
| Operational burden | Low: managed primary | Medium: replica health, failover | Low to medium |
| Best fit | Read-heavy, multi-region apps | Mixed workloads, complex SQL | Static or near-static reads |
Schema migrations and the embedded replica
Migrations run on the primary. The replica picks them up through normal sync. That sounds simple, but two ordering problems are worth knowing about.
First, if a deploy applies a migration on the primary and then rolls out application code that depends on the new schema, that code can come up against a local replica that has not pulled the migration yet. Its queries will fail until the next sync. Solutions are familiar from any rolling deploy: make migrations additive (new columns are nullable), keep code compatible with both shapes, then remove old fields in a follow-up release.
Second, destructive migrations on the primary will eventually replicate too. A DROP TABLE is a write like any other. There is no "point-in-time" mode where the replica stays on the old schema. If you need that, snapshot the database first and restore on demand.
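To make the additive pattern concrete, here is a minimal sketch against the notes table from Step 3. The archived_at column is hypothetical, purely for illustration; because it is nullable, no existing row has to be rewritten and code that never selects it keeps working while replicas catch up.
// src/migrate.ts — an additive migration that is safe during a rolling deploy
import { db } from "./db.js";

// New nullable column: no default required, old rows stay valid as-is.
await db.execute("ALTER TABLE notes ADD COLUMN archived_at TEXT");
console.log("migration applied on the primary; replicas pick it up on their next sync");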
Observability: what to watch in production
- Replica lag. Track the time between a write being committed on the primary and the local sync() picking it up. A creeping lag often signals network issues or sync interval misconfiguration.
- Sync duration and frame count. A pull that suddenly takes seconds and applies thousands of frames usually means a batch import on the primary. Smooth those imports out or you will see read pauses.
- Replica disk usage. SQLite files grow. Run VACUUM periodically on the primary, watch the local file size, and alert before you fill the volume.
- Auth token expiry. If your token expires, sync stops silently and reads keep working from a stale snapshot. Use rotation, log sync errors, and alert on consecutive failures.
All four can be measured with a tiny background task that calls sync(), times it, reads the database file size, and emits metrics. Twenty lines of code keeps you honest.
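A minimal sketch of that task might look like this. The metric names and the emitMetric helper are placeholders for whatever metrics client you already run, and if your client version returns frame counts from sync(), you can emit those as well.
// src/sync-metrics.ts — time each sync and watch the replica file size
import { statSync } from "node:fs";
import { db } from "./db.js";

function emitMetric(name: string, value: number) {
  // stand-in for StatsD / Prometheus / OpenTelemetry — replace with your client
  console.log(JSON.stringify({ metric: name, value, ts: Date.now() }));
}

async function reportSyncHealth() {
  const start = Date.now();
  try {
    await db.sync();
    emitMetric("replica.sync_ms", Date.now() - start);
    emitMetric("replica.file_bytes", statSync("./data/local.db").size);
  } catch (err) {
    // a failing sync (expired token, network trouble) silently serves stale reads
    emitMetric("replica.sync_errors", 1);
    console.error("replica sync failed", err);
  }
}

setInterval(reportSyncHealth, 60_000);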
Common pitfalls and how to avoid them
- Forgetting the initial sync. Cold-starting a fresh container with no replica file means the very first read returns nothing. Always await db.sync() on boot before serving traffic.
- Sync intervals that are too aggressive. A 1-second interval times 50 instances is 50 sync requests per second on an idle system. Most apps do fine at 30 to 120 seconds, with manual sync() on the few hot paths that need it.
- Treating the local file as an authoritative backup. It is a cache. The primary on Turso Cloud is the source of truth. Back up the primary, not the replicas.
- Mixing client modes. Some teams keep one HTTP client and one embedded client in the same process. Pick one per service, or the two will silently disagree about which transactions they have seen.
- Chatty single-row writes. Every standalone write is a round trip to the primary. Wrapping 200 inserts in a single db.batch() call can finish in one HTTP request instead of 200. Use db.batch() or a transaction for any non-trivial write set, as in the sketch after this list.
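A sketch of that batching pattern, assuming the notes table from Step 3 and a hypothetical bodies array to insert:
// one round trip to the primary instead of one per row
const bodies = ["first", "second", "third"]; // hypothetical payload

await db.batch(
  bodies.map((body) => ({
    sql: "INSERT INTO notes (body) VALUES (?)",
    args: [body],
  })),
  "write" // run the whole batch as a single write transaction on the primary
);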
Quick reference
| Task | What to call |
|---|---|
| Open an embedded replica | createClient({ url: "file:./data.db", syncUrl, authToken }) |
| Sync now | await db.sync() |
| Run a query | await db.execute(sql) |
| Parameterised query | await db.execute({ sql, args: [...] }) |
| Batch | await db.batch([{ sql, args }, ...]) |
| Transaction | const tx = await db.transaction("write") |
| Inspect file size | fs.statSync("./data/local.db").size |
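For multi-statement writes that need a read in between, the interactive transaction API from the table works roughly like the sketch below; the statements themselves are just illustrative.
// several statements, one commit on the primary
const tx = await db.transaction("write");
try {
  await tx.execute({
    sql: "INSERT INTO notes (body) VALUES (?)",
    args: ["written inside a transaction"],
  });
  const count = await tx.execute("SELECT COUNT(*) AS n FROM notes");
  console.log("notes so far:", count.rows[0].n);
  await tx.commit();
} catch (err) {
  await tx.rollback();
  throw err;
}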
Next steps
- Wire sync() latency and replica file size into your existing metrics pipeline.
- Pick three read endpoints in your app and convert them to use the embedded replica behind a feature flag. Compare p95 latencies for one week.
- Add a CI check that runs your migrations against a fresh local SQLite file to catch syntax differences from Postgres before they hit the primary.
- Read up on the libSQL offline writes public beta if you have mobile or PWA clients. The same sync engine can buffer writes while disconnected and reconcile on reconnect.
Embedded replicas are not magic. They are a deliberate trade where you pay an explicit eventual-consistency tax in return for moving most of your read latency budget into something you no longer have to think about. For read-heavy apps in 2026, that is usually the trade you want.