DB Bench

Neon vs Xata

Methodology

DB Bench is a continuous benchmarking system that measures cold start, warm latency, query performance, write performance, and branch creation time for two serverless Postgres providers: Neon and Xata. This page documents the testing methodology, known limitations, and design decisions.

How cold start is measured

Cold start data is gathered passively. Both providers are configured with a 5-minute scaleToZero idle timeout. A dedicated cold-start endpoint is scheduled every 15 minutes — well beyond the idle timeout — so both databases should already be hibernated when probed. No in-worker sleeping or active suspension is performed.

Before each probe, we observe the provider's current state via its management API.

Each cold-start run collects 1 sample per provider. The pre-probe state is recorded in sample metadata as preProbeState with a coldVerified flag. If the database was unexpectedly warm (e.g., another process woke it), the sample is still recorded but flagged so the dashboard can filter or label it.
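Concretely, a cold-start sample might be shaped like the sketch below. Only preProbeState and coldVerified come from this page; the remaining field names and the state values are illustrative assumptions.

```typescript
// Illustrative shape of a cold-start sample record. preProbeState and
// coldVerified are described in the text; other names are assumptions.
type ProviderState = "idle" | "active" | "unknown";

interface ColdStartSample {
  provider: "neon" | "xata";
  total_ms: number;             // TTFB for the probe query
  success: boolean;
  preProbeState: ProviderState; // state reported by the management API just before the probe
  coldVerified: boolean;        // true only if the database was idle before probing
}

// A probe counts as a verified cold start only when the management API
// reported the database as idle immediately beforehand; otherwise the sample
// is kept but flagged so the dashboard can filter or label it.
function isColdVerified(preProbeState: ProviderState): boolean {
  return preProbeState === "idle";
}
```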

Cold-start data accumulates over time — roughly 96 samples per day per provider at the 15-minute cadence.

Same driver, fair comparison

Both providers use the same database driver: @neondatabase/serverless (HTTP). Xata natively supports the Neon serverless HTTP protocol, so queries are sent as HTTP requests for both — no TCP handshake, no TLS negotiation per connection. This eliminates protocol asymmetry and gives an apples-to-apples comparison.

Every measurement reports connect_ms and query_ms separately. With the HTTP driver, connect_ms captures client instantiation time (near-zero for both), while query_ms captures the full HTTP round-trip including database execution. The total_ms is the end-to-end measurement.
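The split described above can be sketched as a small timing helper. This is a minimal illustration, not the project's actual code; the helper name and its parameters are assumptions.

```typescript
// Minimal sketch: time client instantiation (connect_ms) separately from the
// query round-trip (query_ms). With an HTTP driver, instantiation does no I/O,
// so connect_ms is near-zero; query_ms carries the full HTTP round-trip
// including database execution.
async function timeSplit<T>(
  connect: () => T,                       // e.g. () => neon(DATABASE_URL)
  query: (client: T) => Promise<unknown>, // e.g. (sql) => sql`SELECT 1`
) {
  const t0 = performance.now();
  const client = connect();               // instantiation only — no network for HTTP drivers
  const t1 = performance.now();
  await query(client);                    // full HTTP round-trip + execution
  const t2 = performance.now();
  return { connect_ms: t1 - t0, query_ms: t2 - t1, total_ms: t2 - t0 };
}

// Usage with the real driver (requires @neondatabase/serverless and a URL):
// import { neon } from "@neondatabase/serverless";
// const m = await timeSplit(() => neon(DATABASE_URL), (sql) => sql`SELECT 1`);
```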

Both providers use their best available driver from a Cloudflare Worker. The HTTP driver is purpose-built for serverless environments and is recommended by both Neon and Xata for edge/serverless deployments.

Why Cloudflare Workers

We run benchmarks from Cloudflare Workers rather than a traditional server, since Workers reflect the edge/serverless client environment that both drivers are built for.

Test types

| Test | Samples | Description |
|---|---|---|
| cold_start | 1 | TTFB after passive idle timeout. Pre-probe state observed via management API. Scheduled every 15 min. |
| warm_latency | 20 | SELECT 1 on a warm connection. Baseline latency. |
| simple_select | 10 | SELECT id, uuid, category FROM bench_items LIMIT 10 |
| indexed_lookup | 10 | SELECT * FROM bench_items WHERE id = $1 with random primary key |
| aggregation | 5 | SELECT category, COUNT(*), AVG(value) FROM bench_items GROUP BY category |
| write | 10 | INSERT INTO bench_items (...) RETURNING id with random data |
| branch_create | 3 | Create branch → poll readiness → verify data → delete. Full lifecycle. |
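The branch_create lifecycle (create → poll readiness → verify data → delete) can be sketched as below. The management-API surface is abstracted behind a hypothetical interface, since each provider exposes branching through its own API; only the sequence of steps comes from this page.

```typescript
// Sketch of the branch_create lifecycle test. The `api` parameter is a
// hypothetical abstraction over a provider's branch management API.
async function branchLifecycle(
  api: {
    create: () => Promise<string>;            // returns a branch id
    isReady: (id: string) => Promise<boolean>;
    verify: (id: string) => Promise<boolean>; // e.g. a row-count check against seed data
    remove: (id: string) => Promise<void>;
  },
  pollMs = 1000,
  maxPolls = 60,
): Promise<number> {
  const t0 = Date.now();
  const id = await api.create();
  let ready = false;
  for (let i = 0; i < maxPolls && !ready; i++) {
    ready = await api.isReady(id);
    if (!ready) await new Promise((r) => setTimeout(r, pollMs));
  }
  if (!ready) {
    await api.remove(id); // clean up even on failure
    throw new Error("branch never became ready");
  }
  if (!(await api.verify(id))) {
    await api.remove(id);
    throw new Error("branch data mismatch");
  }
  await api.remove(id);
  return Date.now() - t0; // full lifecycle duration in ms
}
```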

Sample sizes and frequency

Three separate QStash cron schedules drive data collection, one per measurement cadence.

Percentile calculations (p50, p95, p99) are computed over the successful samples from recent runs. Only samples marked success: true are included in percentile calculations.

Target database

Both providers are seeded with the same dataset: 100,000 rows in a bench_items table with 20 categories, random values, and ~200-character text payloads. The table has indexes on category and created_at. The seed is idempotent — it checks row count before inserting.
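A seed row matching the dataset described above might be generated like this. Column names other than category follow the queries shown in the test table; the generator itself is a hypothetical sketch, not the project's seed script.

```typescript
import { randomUUID } from "node:crypto";

// 20 categories, matching the seeded dataset described above.
const CATEGORIES = Array.from({ length: 20 }, (_, i) => `category_${i + 1}`);

// Hypothetical generator for one bench_items row: random category and value,
// ~200-character text payload.
function buildSeedRow() {
  return {
    uuid: randomUUID(),
    category: CATEGORIES[Math.floor(Math.random() * CATEGORIES.length)],
    value: Math.random() * 1000,
    payload: "x".repeat(200), // stand-in for a ~200-character random payload
  };
}

// Idempotence sketch: check the row count before inserting, so re-running the
// seed against an already-populated table is a no-op:
// const [{ count }] = await sql`SELECT COUNT(*)::int AS count FROM bench_items`;
// if (count === 0) { /* insert 100,000 rows in batches */ }
```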

Known limitations

Disclosure

This project is not affiliated with or endorsed by Neon or Xata. It is an independent benchmarking effort. All source code is open and available for review.

Statistics

Percentiles are computed using linear interpolation between sorted sample values. For a percentile p over n sorted values v[0..n-1], the fractional rank is r = (p / 100) × (n − 1), and the result is v[⌊r⌋] + (r − ⌊r⌋) × (v[⌈r⌉] − v[⌊r⌋]).
This matches the standard "percentile" function used by most analytics tools. Raw samples are stored indefinitely — no rollup or aggregation is applied at the storage layer.
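The linear-interpolation method can be sketched in a few lines; the function name is illustrative.

```typescript
// Percentile via linear interpolation between closest ranks: the fractional
// rank is r = (p / 100) * (n - 1), and the result interpolates between the
// two nearest sorted values. Matches the default used by most analytics tools.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = (p / 100) * (sorted.length - 1);
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  // When rank is an integer, lo === hi and the interpolation term is zero.
  return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
}
```

For example, the p50 of [1, 2, 3, 4] has rank 1.5 and interpolates to 2.5.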