QWP ingress (WebSocket)

Audience

This is a wire-protocol specification for client implementers building a new QuestDB ingest client from scratch. End users should see the language client guides and the connect string reference.

QuestDB Wire Protocol (QWP) is QuestDB's columnar binary protocol for high-throughput data ingestion over WebSocket. Each message carries one or more table blocks, where every column's values are stored contiguously. Batched messages, schema references, and Gorilla-compressed timestamps reduce wire overhead for sustained streaming workloads.

This page covers WebSocket ingress only. For streaming query results back to clients, see QWP egress (WebSocket).

Why implement a QWP client

If your language already has a QuestDB client, use it — the language client guides list what's available. The rest of this section is for implementers writing a new one (e.g., to bring QWP to JavaScript, Rust, Ruby, .NET, or an embedded runtime that the existing clients don't cover).

Compared with the line-oriented ILP protocols (http, https, tcp), QWP uses a denser binary encoding to deliver higher throughput and lower CPU usage on both ends:

  • One schema, many batches. After the first message defines a table's columns, subsequent messages reference the schema by an integer ID — no per-row type tags, no per-batch column names.
  • Columnar wire format. Each column's values are contiguous in the message, so the server commits them column-at-a-time without row-by-row parsing. This is the same shape QuestDB uses on disk.
  • Gorilla timestamps. Steady-cadence timestamps collapse from 8 bytes to as little as 1 bit each via delta-of-delta encoding.
  • Global symbol delta dictionary. Low-cardinality string columns send each distinct value once per connection, then reference it by varint ID.
  • Multi-table batches. A single WebSocket frame can carry rows for many tables in one trip across the wire.
  • Server-acknowledged commits. Every batch gets an OK frame carrying the per-table sequencer transaction it landed in, so the client knows precisely what's durable. An optional X-QWP-Request-Durable-Ack opt-in on the upgrade extends this to cluster-durable acks (Enterprise only).

A minimum-viable client that supports BOOLEAN, LONG, DOUBLE, TIMESTAMP, and VARCHAR (the five types that cover most real workloads) is roughly 500 lines in a typed language, plus a WebSocket library. Adding the remaining ~20 types is mostly extending switch statements; the framing, schema registry, and ack loop stay the same.

The authoritative reference implementation is java-questdb-client. It's worth keeping open in a tab as you read this page.

Overview

QWP encodes data in a column-major layout: all values for a single column are packed together before the next column begins. This allows the server to decompress and commit each column independently, avoiding row-by-row deserialization.

Design goals:

  • Column-oriented: values for each column are contiguous in the message.
  • Batch-oriented: a single message can carry rows for multiple tables.
  • Schema-referencing: after the first batch, subsequent batches reference a previously sent schema by numeric ID, avoiding redundant column definitions.
  • Timestamp compression: designated timestamp columns can use Gorilla delta-of-delta encoding, reducing 8 bytes per timestamp to as little as 1 bit for steady-rate streams.

Every QWP message begins with a 4-byte magic:

Magic  Hex value   Description
-----  ----------  ---------------------
QWP1   0x31505751  Standard data message

Transport and versioning

WebSocket endpoints

The client initiates an HTTP GET request to either /write/v4 or /api/v4/write with standard WebSocket upgrade headers. After the server responds with 101 Switching Protocols, all communication uses binary WebSocket frames.

Version negotiation

During the HTTP upgrade, the client and server negotiate the protocol version using custom headers.

Client request headers:

Header             Required  Description
-----------------  --------  ------------------------------------------------------------
X-QWP-Max-Version  No        Maximum QWP version the client supports (positive integer). Defaults to 1 if absent.
X-QWP-Client-Id    No        Free-form client identifier (e.g., java/1.0.2, zig/0.1.0).

Server response header:

Header         Description
-------------  ---------------------------------------------
X-QWP-Version  The QWP version selected for this connection.

The server selects the version as min(clientMax, serverMax). The selected version is never higher than either side's maximum. The server may also consider the X-QWP-Client-Id when selecting the version.

Connection-level contract

All messages on a connection must carry the negotiated version in the version byte (offset 4) of the message header. The server validates every incoming message against the negotiated version and rejects mismatches with a parse error.

Current version

Ingress is pinned to version 1. No v2 ingest semantics exist. Ingress clients advertise X-QWP-Max-Version: 1.

Authentication

Authentication is handled at the HTTP level during the WebSocket upgrade handshake, before any QWP binary frames are exchanged.

Supported methods:

  • HTTP Basic (Authorization: Basic …)
  • Bearer token (Authorization: Bearer …)

A failed authentication results in a 401 or 403 HTTP response before the WebSocket connection is established. No QWP-level auth handshake exists.

Client lifecycle

The end-to-end shape of a QWP client session, before the encoding details:

  1. Open WebSocket. Issue an HTTP GET to /write/v4 (or /api/v4/write) with the standard Upgrade: websocket headers, plus:
    • X-QWP-Max-Version: 1 — highest version supported.
    • X-QWP-Client-Id: <name>/<version> — recommended, helps server-side diagnostics and version negotiation.
    • Authentication header (Authorization: Basic … or Authorization: Bearer …).
    • X-QWP-Request-Durable-Ack: true — optional, opt-in for cluster-durable acks (Enterprise).
  2. Verify the upgrade. On 101 Switching Protocols, read the response headers:
    • X-QWP-Version — the version the connection runs on. Use it for the version byte in every outgoing message header. Reject the connection if it's outside the range your client supports.
    • X-QWP-Durable-Ack: enabled — confirms durable-ack frames will follow, iff you opted in. If you opted in and this header is absent, fail the connection (don't silently wait for acks the server will never send).
  3. Send binary frames. Each frame is one QWP message: 12-byte header + payload (Delta Symbol Dictionary if any, then one or more Table Blocks). The first frame for a given table carries a full schema; subsequent frames for the same column set reference it by schema ID.
  4. Drain server responses. The server sends an OK (or error) binary frame per request, in send order. Match responses to requests by their position in your in-flight queue — the server-assigned sequence field in each response is the authoritative confirmation. If you opted in to durable ack, you'll also receive periodic STATUS_DURABLE_ACK frames carrying cumulative per-table watermarks.
  5. Close. Send a WebSocket Close frame after the last expected OK has been drained.

Every reconnect resets connection-scoped state on both sides: schema IDs, symbol dictionary, and sequence counter. Clients that want sender-restart durability layer a store-and-forward buffer on top — see the connect string reference.
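The handshake in steps 1 and 2 can be sketched without committing to a particular WebSocket library. The helper names (`build_upgrade_headers`, `check_upgrade_response`) and the `SUPPORTED_VERSIONS` set are illustrative assumptions, not part of any client API; the header names and values are the ones listed above. Treating a missing X-QWP-Version header as unsupported is also an assumption of this sketch.

```python
import base64

SUPPORTED_VERSIONS = {1}  # assumption: this sketch only speaks QWP v1


def build_upgrade_headers(client_id, username=None, password=None,
                          token=None, durable_ack=False):
    """Return the QWP-specific headers to add to the WebSocket upgrade request."""
    headers = {
        "X-QWP-Max-Version": str(max(SUPPORTED_VERSIONS)),
        "X-QWP-Client-Id": client_id,
    }
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    elif username is not None:
        raw = f"{username}:{password or ''}".encode("utf-8")
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode("ascii")
    if durable_ack:
        headers["X-QWP-Request-Durable-Ack"] = "true"
    return headers


def check_upgrade_response(status, headers, durable_ack_requested=False):
    """Validate the upgrade response and return the negotiated version."""
    if status != 101:
        raise ConnectionError(f"upgrade failed with HTTP status {status}")
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    # Assumption: a missing X-QWP-Version header is treated as unsupported.
    version = int(lowered.get("x-qwp-version", "-1"))
    if version not in SUPPORTED_VERSIONS:
        raise ConnectionError(f"unsupported QWP version {version}")
    if durable_ack_requested and lowered.get("x-qwp-durable-ack") != "enabled":
        raise ConnectionError("server did not confirm durable acks")
    return version
```

The returned version is the value to place in the version byte of every outgoing message header.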

Encoding primitives

Byte ordering

All multi-byte numeric values are little-endian. Variable-length integers use unsigned LEB128 (see below).

Variable-length integer encoding (varint)

LEB128

LEB128 (Little Endian Base 128) is a variable-length integer encoding from the DWARF debugging format, also used by Protocol Buffers and WebAssembly. It encodes small values in fewer bytes than fixed-width integers.

QWP uses unsigned LEB128 for variable-length integers. Values are split into 7-bit groups, least significant first. The high bit of each byte is a continuation flag: set (1) means more bytes follow, clear (0) means this is the last byte. A 64-bit value requires at most 10 bytes.

Encoding:

while (value & ~0x7F) != 0:
    output_byte((value & 0x7F) | 0x80)
    value >>= 7
output_byte(value)

Decoding:

result = 0
shift = 0
while True:
    b = read_byte()
    result |= (b & 0x7F) << shift
    shift += 7
    if (b & 0x80) == 0:
        break
return result

Examples:

Value  Encoded bytes
-----  --------------
0      0x00
1      0x01
127    0x7F
128    0x80 0x01
255    0xFF 0x01
300    0xAC 0x02
16384  0x80 0x80 0x01
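The pseudocode above translates almost directly into runnable Python. The following is a minimal sketch; the function names `encode_varint` and `decode_varint` are illustrative, and the 10-byte guard in the decoder is a defensive choice based on the stated maximum length for a 64-bit value.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while value & ~0x7F:
        out.append((value & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        value >>= 7
    out.append(value)                       # final byte, continuation bit clear
    return bytes(out)


def decode_varint(buf: bytes, pos: int = 0):
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        shift += 7
        if (b & 0x80) == 0:
            return result, pos
        if shift >= 70:  # a 64-bit value needs at most 10 bytes
            raise ValueError("varint longer than 10 bytes")
```

Returning the next read position from the decoder makes it easy to walk a payload field by field.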

ZigZag encoding

ZigZag encoding maps signed integers to unsigned integers so that values with small absolute values produce small varints. It was popularized by Protocol Buffers.

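def zigzag_encode(n):
    # n is a signed 64-bit value; n >> 63 is 0 for n >= 0 and -1 for n < 0
    return (n << 1) ^ (n >> 63)

def zigzag_decode(n):
    # n is the unsigned encoded value
    return (n >> 1) ^ -(n & 1)

Signed  Unsigned
------  --------
0       0
-1      1
1       2
-2      3
2       4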

Message structure

Message header (12 bytes, fixed)

Offset  Size  Type    Field           Description
------  ----  ------  --------------  ----------------------------
0       4     int32   magic           "QWP1" (0x31505751)
4       1     uint8   version         Protocol version (0x01)
5       1     uint8   flags           Encoding flags
6       2     uint16  table_count     Number of table blocks
8       4     uint32  payload_length  Payload size in bytes

Total message size = 12 + payload_length.

Flags byte

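Bit  Mask  Name                    Description
---  ----  ----------------------  ------------------------------------------------------
0-1        Reserved                Must be 0
2    0x04  FLAG_GORILLA            Gorilla delta-of-delta encoding for timestamp columns
3    0x08  FLAG_DELTA_SYMBOL_DICT  Delta symbol dictionary mode enabled
4-7        Reserved                Must be 0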
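As a sketch of the fixed header and flags byte, the layout above maps onto a single little-endian `struct` format. The names `build_message` and `parse_header` are illustrative; the length check in `parse_header` simply applies "total message size = 12 + payload_length".

```python
import struct

QWP_MAGIC = 0x31505751            # bytes 51 57 50 31 on the wire, i.e. "QWP1"
FLAG_GORILLA = 0x04
FLAG_DELTA_SYMBOL_DICT = 0x08
# magic, version, flags, table_count, payload_length (little-endian, no padding)
HEADER = struct.Struct("<IBBHI")


def build_message(payload, table_count, flags=FLAG_DELTA_SYMBOL_DICT, version=1):
    """Prefix an already-encoded payload with the 12-byte message header."""
    if flags & ~(FLAG_GORILLA | FLAG_DELTA_SYMBOL_DICT):
        raise ValueError("reserved flag bits must be 0")
    return HEADER.pack(QWP_MAGIC, version, flags, table_count, len(payload)) + payload


def parse_header(frame):
    """Return (version, flags, table_count, payload_length) for one frame."""
    magic, version, flags, table_count, payload_length = HEADER.unpack_from(frame, 0)
    if magic != QWP_MAGIC:
        raise ValueError("bad magic")
    if len(frame) != HEADER.size + payload_length:
        raise ValueError("frame length does not match payload_length")
    return version, flags, table_count, payload_length
```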

Complete message layout

+------------------------------------------+
| Message Header (12 bytes)                |
+------------------------------------------+
| Payload (variable)                       |
|   +- [Delta Symbol Dictionary] (if 0x08) |
|   +- Table Block 0                       |
|   +- Table Block 1                       |
|   +- ... Table Block N-1                 |
+------------------------------------------+

Delta symbol dictionary

Present only when FLAG_DELTA_SYMBOL_DICT (0x08) is set. Appears at the start of the payload, before any table blocks.

+-----------------------------------------------------------+
| delta_start: varint     Starting global ID for this delta |
| delta_count: varint     Number of new entries             |
| For each new entry:                                       |
|   name_length: varint   UTF-8 byte length                 |
|   name_bytes: bytes     UTF-8 encoded symbol string       |
+-----------------------------------------------------------+

The client maintains a global symbol dictionary mapping symbol strings to sequential integer IDs starting from 0. On each batch, only newly added symbols (the "delta") are transmitted. The server accumulates these entries across batches for the lifetime of the connection.

WebSocket clients set FLAG_DELTA_SYMBOL_DICT on every message and use global delta dictionaries exclusively. Symbol columns then contain varint-encoded global IDs instead of per-column dictionaries.

On connection loss, both sides reset the dictionary.
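A client-side dictionary that tracks which entries have already been sent might look like the sketch below. The class name `SymbolDictionary` and its methods are hypothetical. Writing `delta_start` as the current dictionary size when there are no new entries is an assumption of this sketch, since the text above does not cover that case.

```python
def encode_varint(value):
    """Unsigned LEB128, as described under Encoding primitives."""
    out = bytearray()
    while value & ~0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


class SymbolDictionary:
    """Connection-scoped map from symbol string to sequential global ID."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Called on connection loss: IDs restart from 0 on the next connection.
        self._ids = {}
        self._pending = []   # symbols added since the last encoded delta
        self._sent = 0       # number of entries the server already knows

    def intern(self, symbol):
        """Return the global ID for symbol, assigning the next ID if new."""
        sid = self._ids.get(symbol)
        if sid is None:
            sid = len(self._ids)
            self._ids[symbol] = sid
            self._pending.append(symbol)
        return sid

    def encode_delta(self):
        """Encode the delta section for the next message and mark it sent."""
        out = bytearray()
        out += encode_varint(self._sent)           # delta_start
        out += encode_varint(len(self._pending))   # delta_count
        for symbol in self._pending:
            raw = symbol.encode("utf-8")
            out += encode_varint(len(raw)) + raw
        self._sent += len(self._pending)
        self._pending.clear()
        return bytes(out)
```

Symbol columns then write the IDs returned by `intern` as varints, and each message starts its payload with the bytes from `encode_delta`.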

Table blocks

Each table block contains data for a single table.

+----------------------------------+
| Table Header (variable) |
+----------------------------------+
| Schema Section (variable) |
+----------------------------------+
| Column Data (variable) |
| +- Column 0 data |
| +- Column 1 data |
| +- ... Column N-1 data |
+----------------------------------+

Table header

Field         Type    Description
------------  ------  ----------------------------
name_length   varint  Table name length in bytes
name          UTF-8   Table name (max 127 bytes)
row_count     varint  Number of rows in this block
column_count  varint  Number of columns

Schema definition

The schema section immediately follows the table header and defines the columns in the block.

Schema mode byte

Value  Mode       Description
-----  ---------  ----------------------------------------------
0x00   Full       Schema ID + complete column definitions inline
0x01   Reference  Schema ID only (lookup from registry)

Full schema mode (0x00)

Sent the first time a table's schema appears on a connection, or whenever the column set changes.

+----------------------------------+
| mode_byte: 0x00 |
+----------------------------------+
| schema_id: varint |
+----------------------------------+
| Column Definition 0 |
| +- name_length: varint |
| +- name: UTF-8 bytes |
| +- type_code: uint8 |
+----------------------------------+
| Column Definition 1 ... |
+----------------------------------+

Schema IDs are non-negative integers assigned by the client and scoped to the lifetime of a single connection. They are global across all tables on the connection (not per-table). Clients typically assign them sequentially starting at 0, but the server does not require any particular ordering.

A column with an empty name (length 0) and type TIMESTAMP denotes the designated timestamp column, the per-table column that QuestDB uses for time-based partitioning and ordering.

Reference schema mode (0x01)

Used for subsequent batches when the server has already registered the schema.

+-------------------------+
| mode_byte: 0x01 |
+-------------------------+
| schema_id: varint |
+-------------------------+

The server looks up the schema by its ID in the per-connection schema registry.

Schema registry lifecycle

  1. First batch for a table: full schema mode with a new schema ID.
  2. Subsequent batches with the same columns: reference mode with the same ID.
  3. When a table gains a column, the client assigns a new schema ID and sends it in full mode.
  4. Full-mode schemas may re-register an existing ID; the server accepts any ID within the per-connection schema-ID limit.
  5. On reconnect, both sides reset: the client reassigns IDs from 0 and the server clears its registry.
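The lifecycle above can be sketched as a small client-side registry that emits full mode the first time it sees a column set and reference mode afterwards. `SchemaRegistry` and `encode_table_header` are hypothetical names, and keying the registry by (table name, column list) is one possible design, not a protocol requirement.

```python
def encode_varint(value):
    """Unsigned LEB128, as described under Encoding primitives."""
    out = bytearray()
    while value & ~0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_table_header(name, row_count, column_count):
    raw = name.encode("utf-8")
    if len(raw) > 127:
        raise ValueError("table name exceeds 127 bytes")
    return (encode_varint(len(raw)) + raw
            + encode_varint(row_count) + encode_varint(column_count))


class SchemaRegistry:
    """Assigns connection-scoped schema IDs, sequentially from 0."""

    def __init__(self):
        self._ids = {}

    def reset(self):
        # On reconnect the server clears its registry, so the client does too.
        self._ids.clear()

    def encode_schema(self, table, columns):
        """columns is a list of (name, type_code); returns the schema section."""
        key = (table, tuple(columns))
        schema_id = self._ids.get(key)
        if schema_id is not None:
            return bytes([0x01]) + encode_varint(schema_id)   # reference mode
        schema_id = len(self._ids)
        self._ids[key] = schema_id
        out = bytearray([0x00])                               # full mode
        out += encode_varint(schema_id)
        for name, type_code in columns:
            raw = name.encode("utf-8")
            out += encode_varint(len(raw)) + raw + bytes([type_code])
        return bytes(out)
```

An empty column name with type TIMESTAMP (0x0A) marks the designated timestamp, as described above.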

Column types

Code  Hex   Type             Size   Description
----  ----  ---------------  -----  -----------------------------
1     0x01  BOOLEAN          1 bit  Bit-packed boolean
2     0x02  BYTE             1      Signed 8-bit integer
3     0x03  SHORT            2      Signed 16-bit integer
4     0x04  INT              4      Signed 32-bit integer
5     0x05  LONG             8      Signed 64-bit integer
6     0x06  FLOAT            4      IEEE 754 single precision
7     0x07  DOUBLE           8      IEEE 754 double precision
9     0x09  SYMBOL           var    Dictionary-encoded string
10    0x0A  TIMESTAMP        8      Microseconds since Unix epoch
11    0x0B  DATE             8      Milliseconds since Unix epoch
12    0x0C  UUID             16     RFC 4122 UUID
13    0x0D  LONG256          32     256-bit integer
14    0x0E  GEOHASH          var    Geospatial hash
15    0x0F  VARCHAR          var    Length-prefixed UTF-8
16    0x10  TIMESTAMP_NANOS  8      Nanoseconds since Unix epoch
17    0x11  DOUBLE_ARRAY     var    N-dimensional double array
18    0x12  LONG_ARRAY       var    N-dimensional long array
19    0x13  DECIMAL64        8      Decimal (18 digits precision)
20    0x14  DECIMAL128       16     Decimal (38 digits precision)
21    0x15  DECIMAL256       32     Decimal (77 digits precision)
22    0x16  CHAR             2      Single UTF-16 code unit
23    0x17  BINARY           var    Length-prefixed opaque bytes
24    0x18  IPv4             4      32-bit IPv4 address

Code 0x08 is unassigned. It was previously STRING, which has been removed. Use VARCHAR (0x0F) for text columns.

TIMESTAMP and TIMESTAMP_NANOS may use Gorilla encoding when FLAG_GORILLA is set. See Timestamp encoding below.

Null handling

Each column's data section begins with a 1-byte null flag. The flag tells the decoder how nulls are represented in the data that follows.

Sentinel mode (null flag = 0x00)

No bitmap follows. The column data contains one value per row (row_count values total). Null rows are represented by a reserved marker value (a "sentinel") that falls outside the column's valid range. For example, 0x00 for BYTE or 0x0000 for SHORT. The decoder recognizes these values as null rather than as real data.

Sentinel mode requires the type to have a dedicated null representation. Types whose full value range is meaningful payload (e.g., VARCHAR, SYMBOL) cannot use sentinel mode.

Bitmap mode (null flag != 0x00)

A null bitmap follows immediately after the flag byte. The column data then contains only non-null values, densely packed (value_count = row_count - null_count).

Bitmap format:

  • Size: ceil(row_count / 8) bytes
  • Bit order: LSB first within each byte
  • Semantics: bit = 1 means the row is NULL, bit = 0 means the row has a value

Byte 0:  [row7][row6][row5][row4][row3][row2][row1][row0]
Byte 1:  [row15][row14][row13][row12][row11][row10][row9][row8]
...

Accessing null status:

byte_index = row_index // 8
bit_index = row_index % 8
is_null = (bitmap[byte_index] & (1 << bit_index)) != 0

Example: 10 rows where rows 0, 2, and 9 are null:

Byte 0: 0b00000101 = 0x05  (bits 0 and 2 set)
Byte 1: 0b00000010 = 0x02 (bit 1 set = row 9)

Complete column data layout

+------------------------------------------------------------+
| null_flag: uint8    0 = sentinel, nonzero = bitmap         |
| [null bitmap: ceil(row_count/8) bytes if flag != 0]        |
| Column values:                                             |
|   flag == 0 : row_count entries (null rows use sentinels)  |
|   flag != 0 : value_count non-null entries, densely packed |
|               (value_count = row_count - null_count)       |
+------------------------------------------------------------+

The encoder chooses the strategy per column. The decoder must support both.
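A minimal sketch of bitmap mode for a LONG column follows, using `None` for null rows. The function names are illustrative, and the nonzero flag value 0x01 is an arbitrary choice, since the layout above only requires the flag to be nonzero.

```python
import struct


def build_null_bitmap(values):
    """One bit per row, LSB-first within each byte; 1 means the row is NULL."""
    bitmap = bytearray((len(values) + 7) // 8)
    for row, value in enumerate(values):
        if value is None:
            bitmap[row // 8] |= 1 << (row % 8)
    return bytes(bitmap)


def encode_long_column(values):
    """Encode a LONG column: null flag, optional bitmap, then int64 values."""
    non_null = [v for v in values if v is not None]
    if len(non_null) == len(values):
        # No nulls: flag 0x00 followed by row_count values.
        return b"\x00" + struct.pack(f"<{len(values)}q", *values)
    # Bitmap mode: nonzero flag, bitmap, then only the non-null values.
    return (b"\x01" + build_null_bitmap(values)
            + struct.pack(f"<{len(non_null)}q", *non_null))
```

With 10 rows where rows 0, 2, and 9 are null, `build_null_bitmap` yields the two bytes 0x05 0x02 from the example above.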

Sentinel values

When the reference implementation emits sentinel mode (null flag = 0x00), null rows are encoded as:

Type     Sentinel
-------  ------------------------------------------------------------------
BOOLEAN  bit 0 (false)
BYTE     0x00
SHORT    0x0000
CHAR     0x0000
GEOHASH  All-ones (0xFF...FF), truncated to ceil(precision_bits / 8) bytes

Reference implementation null strategy

The reference Java client uses these strategies per type:

Strategy  Types
--------  ------------------------------------------------------------------
Sentinel  BOOLEAN, BYTE, SHORT, CHAR, GEOHASH
Bitmap    INT, LONG, FLOAT, DOUBLE, VARCHAR, SYMBOL, TIMESTAMP, TIMESTAMP_NANOS, DATE, UUID, LONG256, DECIMAL64, DECIMAL128, DECIMAL256, DOUBLE_ARRAY, LONG_ARRAY

Alternative implementations may make different per-column choices as long as the null flag accurately describes the data that follows. A column with no null rows produces identical output under either strategy (null flag = 0x00, row_count values).

Column data encoding

Fixed-width types

For BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, DATE, CHAR, and IPv4: values are written as contiguous arrays of their respective sizes in little-endian byte order.

+------------------------------------------------------+
| [Null flag + bitmap (see Null handling)] |
+------------------------------------------------------+
| Values: |
| value[0], value[1], ... value[N-1] |
| N = row_count if null_flag == 0 |
| N = row_count - null_count if null_flag != 0 |
+------------------------------------------------------+

Boolean

Values are bit-packed, 8 per byte, LSB-first. ceil(N/8) bytes are written where N = row_count in sentinel mode or N = row_count - null_count in bitmap mode. The reference implementation uses sentinel mode for BOOLEAN: null rows appear as bit 0 (false).

Values [true, false, true, true, false, false, false, true]:
0b10001101 = 0x8D
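The bit packing can be sketched as follows. `pack_bools` is an illustrative name; treating `None` as false mirrors the sentinel behavior described for the reference implementation.

```python
def pack_bools(values):
    """Pack booleans 8 per byte, LSB-first. None (null) is written as false."""
    out = bytearray((len(values) + 7) // 8)
    for i, value in enumerate(values):
        if value:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def encode_boolean_column(values):
    """Sentinel mode: null flag 0x00, then one bit per row."""
    return b"\x00" + pack_bools(values)
```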

VARCHAR and BINARY

VARCHAR and BINARY share the same wire format:

+--------------------------------------------------+
| [Null flag + bitmap (see Null handling)] |
+--------------------------------------------------+
| Offset array: (value_count + 1) x uint32 LE |
| offset[0] = 0 |
| offset[i+1] = end of value[i] |
+--------------------------------------------------+
| Data: concatenated bytes |
+--------------------------------------------------+

  • value_count = row_count - null_count
  • Offsets are uint32, little-endian (all multi-byte numeric values in QWP are little-endian; restated here because the diagram is often skimmed).
  • Value i spans bytes [offset[i], offset[i+1])
  • For VARCHAR, the bytes are valid UTF-8. For BINARY, the bytes are opaque.
  • The uint32 offsets bound individual values to 2^31 - 1 bytes.
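As a sketch of the offset-array layout, the encoder below handles a VARCHAR column with optional nulls (`None`). The function name is illustrative, and 0x01 is an arbitrary nonzero null flag.

```python
import struct


def encode_varchar_column(values):
    """Null flag, optional bitmap, (value_count + 1) uint32 offsets, then data."""
    encoded = [v.encode("utf-8") for v in values if v is not None]
    out = bytearray()
    if len(encoded) == len(values):
        out.append(0x00)                      # no nulls, no bitmap
    else:
        out.append(0x01)                      # bitmap mode
        bitmap = bytearray((len(values) + 7) // 8)
        for row, value in enumerate(values):
            if value is None:
                bitmap[row // 8] |= 1 << (row % 8)
        out += bitmap
    offsets = [0]
    for raw in encoded:
        offsets.append(offsets[-1] + len(raw))  # offset[i+1] = end of value[i]
    out += struct.pack(f"<{len(offsets)}I", *offsets)
    out += b"".join(encoded)
    return bytes(out)
```

For BINARY the same layout applies with the values passed through as raw bytes instead of being UTF-8 encoded.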

Symbol

Dictionary-encoded strings for low-cardinality columns.

WebSocket uses global delta dictionaries only

WebSocket clients set FLAG_DELTA_SYMBOL_DICT (0x08) on every message and use the global delta dictionary mode exclusively. The per-table dictionary mode used by UDP datagrams is not covered here.

The dictionary entries themselves are sent in the message-level delta symbol dictionary section. Column data for a SYMBOL column is then just a sequence of varint-encoded global IDs, one per non-null row:

+--------------------------------------------+
| [Null flag + bitmap (see Null handling)] |
+--------------------------------------------+
| For each non-null row: |
| global_id: varint Global symbol ID |
+--------------------------------------------+

The client owns the global ID assignment. Each new string gets the next sequential integer, starting from 0 on a fresh connection. Only the new entries since the previous message are transmitted; the server accumulates the dictionary for the lifetime of the connection.

Timestamp encoding

Gorilla compression

Gorilla is a time-series compression scheme from the Facebook/Meta Gorilla paper (Pelkonen et al., VLDB 2015). It exploits the regularity of timestamps in time-series data by encoding the delta-of-deltas between consecutive values, which are often zero or very small.

When FLAG_GORILLA (0x04) is not set, timestamp columns are written as plain int64 arrays with no encoding flag:

+----------------------------------------------+
| [Null flag + bitmap (see Null handling)] |
+----------------------------------------------+
| Timestamp values (non-null only): |
| value_count x int64 |
+----------------------------------------------+

When FLAG_GORILLA (0x04) is set, a 1-byte encoding flag follows the null handling section:

Flag  Mode          Description
----  ------------  -------------------------------------
0x00  Uncompressed  Array of int64 values (non-null only)
0x01  Gorilla       Delta-of-delta compressed

Uncompressed mode (0x00):

+----------------------------------------------+
| [Null flag + bitmap (see Null handling)] |
+----------------------------------------------+
| encoding_flag: uint8 (0x00) |
+----------------------------------------------+
| Timestamp values (non-null only): |
| value_count x int64 |
+----------------------------------------------+

Gorilla mode (0x01):

+----------------------------------------------+
| [Null flag + bitmap (see Null handling)] |
+----------------------------------------------+
| encoding_flag: uint8 (0x01) |
+----------------------------------------------+
| first_timestamp: int64 |
+----------------------------------------------+
| second_timestamp: int64 |
+----------------------------------------------+
| Bit-packed delta-of-deltas: |
| For timestamps 3..N |
+----------------------------------------------+

Gorilla delta-of-delta algorithm

The first two timestamps are written in full as int64 values. Starting from the third timestamp (index i = 2), each subsequent value is encoded as a delta-of-deltas:

delta_i  = t[i] - t[i - 1]
dod_i = delta_i - delta_{i-1} # delta_{i-1} = t[i-1] - t[i-2]

The very first encoded DoD applies at i = 2, where delta_{i-1} = t[1] - t[0]. There is no implicit zero-delta anchor before that.

Encoding buckets (bits are written LSB-first):

Condition             Prefix  Value bits   Total bits
--------------------  ------  -----------  ----------
DoD == 0              0       0            1
DoD in [-64, 63]      10      7 (signed)   9
DoD in [-256, 255]    110     9 (signed)   12
DoD in [-2048, 2047]  1110    12 (signed)  16
Otherwise             1111    32 (signed)  36

The bit stream is padded to a byte boundary at the end. If any DoD value exceeds the 32-bit signed integer range, the encoder falls back to uncompressed mode.
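The following sketch shows one way to produce the Gorilla section for a list of non-null timestamps when FLAG_GORILLA is set. It rests on assumptions that should be checked against the reference implementation: prefix bits are emitted in the order shown in the table (left to right), value bits are emitted least significant bit first as two's complement, and columns with fewer than three timestamps fall back to uncompressed mode. `BitWriter` and `encode_timestamps` are hypothetical names.

```python
import struct

INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1
# (lowest DoD, highest DoD, prefix bits in stream order, value bit width)
BUCKETS = [
    (-64, 63, (1, 0), 7),
    (-256, 255, (1, 1, 0), 9),
    (-2048, 2047, (1, 1, 1, 0), 12),
    (INT32_MIN, INT32_MAX, (1, 1, 1, 1), 32),
]


class BitWriter:
    """Appends bits LSB-first within each byte; the last byte is zero-padded."""

    def __init__(self):
        self.buf = bytearray()
        self.nbits = 0

    def write_bit(self, bit):
        if self.nbits % 8 == 0:
            self.buf.append(0)
        if bit:
            self.buf[-1] |= 1 << (self.nbits % 8)
        self.nbits += 1

    def write_signed(self, value, width):
        value &= (1 << width) - 1          # two's complement in `width` bits
        for i in range(width):             # assumption: LSB of the value first
            self.write_bit((value >> i) & 1)


def encode_timestamps(ts):
    """Return encoding_flag plus data for the non-null timestamps in ts."""
    plain = b"\x00" + struct.pack(f"<{len(ts)}q", *ts)
    if len(ts) < 3:
        return plain                       # assumption: nothing to compress
    bits = BitWriter()
    prev_delta = ts[1] - ts[0]
    for i in range(2, len(ts)):
        delta = ts[i] - ts[i - 1]
        dod = delta - prev_delta
        prev_delta = delta
        if dod == 0:
            bits.write_bit(0)
            continue
        for low, high, prefix, width in BUCKETS:
            if low <= dod <= high:
                for bit in prefix:
                    bits.write_bit(bit)
                bits.write_signed(dod, width)
                break
        else:
            return plain                   # DoD outside int32: uncompressed
    return b"\x01" + struct.pack("<2q", ts[0], ts[1]) + bytes(bits.buf)
```

A steady-cadence stream such as 1000, 2000, 3000, 4000 produces two zero DoDs, so the whole bit stream fits in one padded byte after the two full timestamps.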

UUID

16 bytes per value: 8 bytes for the low 64 bits, then 8 bytes for the high 64 bits, both little-endian.

LONG256

32 bytes per value: four int64 values, least significant first, all little-endian.
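Both layouts can be sketched with integer arithmetic. The sketch assumes that "low 64 bits" of a UUID means the least significant half of its 128-bit integer value, and it treats LONG256 as an unsigned 256-bit integer split into four 64-bit limbs; both are assumptions to verify against the reference client.

```python
import uuid

MASK64 = (1 << 64) - 1


def encode_uuid(value: uuid.UUID) -> bytes:
    """Low 64 bits first, then high 64 bits, each little-endian."""
    low = value.int & MASK64
    high = value.int >> 64
    return low.to_bytes(8, "little") + high.to_bytes(8, "little")


def encode_long256(value: int) -> bytes:
    """Four 64-bit limbs, least significant first, each little-endian."""
    if not 0 <= value < (1 << 256):
        raise ValueError("value does not fit in 256 bits")
    limbs = [(value >> (64 * i)) & MASK64 for i in range(4)]
    return b"".join(limb.to_bytes(8, "little") for limb in limbs)
```

Under these assumptions both encodings are simply the whole integer written in little-endian byte order.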

GeoHash

+------------------------------------------------------+
| [Null flag + bitmap (see Null handling)] |
+------------------------------------------------------+
| precision_bits: varint (1-60) |
+------------------------------------------------------+
| Packed geohash values: |
| bytes_per_value = ceil(precision_bits / 8) |
| total = bytes_per_value x N |
| N = row_count if null_flag == 0 |
| N = row_count - null_count if null_flag != 0 |
+------------------------------------------------------+

The reference implementation uses sentinel mode for GEOHASH: null rows are encoded as all-ones truncated to bytes_per_value.

Array types (DOUBLE_ARRAY, LONG_ARRAY)

N-dimensional arrays, row-major order:

+-----------------------------------------------------+
| For each non-null row:                              |
|   n_dims: uint8                 Number of dimensions |
|   dim_lengths: n_dims x int32   Length per dimension |
|   values: product(dims) x element                   |
|     (float64 for DOUBLE_ARRAY,                      |
|      int64 for LONG_ARRAY)                          |
+-----------------------------------------------------+
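One row of a DOUBLE_ARRAY column could be encoded as in this sketch. The helper names are illustrative, and `flatten_2d` only covers the two-dimensional case to show what row-major order means in practice.

```python
import struct
from math import prod


def flatten_2d(rows):
    """Flatten a rectangular list of lists in row-major order."""
    dims = (len(rows), len(rows[0]) if rows else 0)
    if any(len(row) != dims[1] for row in rows):
        raise ValueError("array rows must all have the same length")
    return dims, [x for row in rows for x in row]


def encode_double_array_row(dims, flat_values):
    """n_dims (uint8), one int32 per dimension, then float64 values."""
    if len(dims) > 255:
        raise ValueError("n_dims must fit in a uint8")
    if prod(dims) != len(flat_values):
        raise ValueError("value count does not match the dimensions")
    header = struct.pack(f"<B{len(dims)}i", len(dims), *dims)
    body = struct.pack(f"<{len(flat_values)}d", *flat_values)
    return header + body
```

A LONG_ARRAY row has the same shape with the `d` format character replaced by `q`.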

Decimal types (DECIMAL64, DECIMAL128, DECIMAL256)

Decimal values are stored as two's complement integers. A 1-byte scale prefix is shared by all values in the column. The scale is the number of decimal digits to the right of the decimal point — i.e., the real value is reconstructed as:

value = unscaled_int / 10^scale

For example, with scale = 3 an unscaled int64 of 12345 decodes to 12.345. The scale is base-10, not base-2.

+----------------------------------------------+
| [Null flag + bitmap (see Null handling)] |
+----------------------------------------------+
| scale: uint8 |
+----------------------------------------------+
| Unscaled values: |
| DECIMAL64: 8 bytes x value_count |
| DECIMAL128: 16 bytes x value_count |
| DECIMAL256: 32 bytes x value_count |
+----------------------------------------------+

Type        Value size  Precision
----------  ----------  ---------
DECIMAL64   8 bytes     18 digits
DECIMAL128  16 bytes    38 digits
DECIMAL256  32 bytes    77 digits
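To illustrate the shared scale, this sketch converts Python `Decimal` values into the scale byte plus unscaled int64 values of a DECIMAL64 column (the part that follows the null handling section). The function name is hypothetical, and rejecting values with more fractional digits than the scale is a choice made here, not a protocol rule.

```python
import struct
from decimal import Decimal


def encode_decimal64_values(values, scale):
    """Return the scale byte followed by little-endian unscaled int64 values."""
    if not 0 <= scale <= 255:
        raise ValueError("scale must fit in a uint8")
    out = bytearray([scale])
    for value in values:
        scaled = value.scaleb(scale)            # value * 10^scale, exactly
        if scaled != scaled.to_integral_value():
            raise ValueError(f"{value} has more than {scale} fractional digits")
        # struct raises struct.error if the unscaled value overflows int64.
        out += struct.pack("<q", int(scaled))
    return bytes(out)
```

With scale = 3, `Decimal("12.345")` is written as the unscaled integer 12345, matching the example above.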

Server responses

Every response starts with a 1-byte status code. OK and error responses include an 8-byte sequence number that correlates the response with the original request.

Sequence numbering

The QWP wire encoder does not put a sequence number into the request header — the message header at offset 0 ends at offset 12 with payload_length, and that is the entire client-side framing. The server assigns the sequence number itself: it counts inbound binary frames on the connection (starting at 0) and echoes the assigned wireSeq in the sequence field of every OK and error frame.

Two consequences for client implementers:

  • Frames must be sent in strict order. The server assumes "the Nth frame received is wireSeq = N", so any reordering by the client breaks the mapping between requests and responses.
  • Match responses by send order. The client tracks an ordered list of outstanding messages; the next OK/error response always corresponds to the oldest unacknowledged message, and the sequence field is the server's authoritative confirmation of which one.

On a fresh connection both sides start at 0. On reconnect both sides reset.
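The bookkeeping described above amounts to a FIFO of outstanding batches plus a local mirror of the server's frame counter. A minimal sketch, with the hypothetical class name `InFlightQueue` and the default in-flight limit of 128 taken from the protocol limits table:

```python
from collections import deque


class InFlightQueue:
    """Tracks sent-but-unacknowledged batches in send order."""

    def __init__(self, max_in_flight=128):
        self._max = max_in_flight
        self.reset()

    def reset(self):
        # On reconnect both sides restart sequence numbering at 0.
        self._pending = deque()
        self._next_seq = 0

    def on_send(self, batch):
        """Record a batch just before its frame is written; return its wireSeq."""
        if len(self._pending) >= self._max:
            raise BufferError("too many in-flight batches; drain responses first")
        seq = self._next_seq
        self._next_seq += 1
        self._pending.append((seq, batch))
        return seq

    def on_response(self, sequence):
        """Match an OK or error frame to the oldest outstanding batch."""
        if not self._pending:
            raise ValueError("response received with nothing in flight")
        expected, batch = self._pending[0]
        if sequence != expected:
            raise ValueError(f"expected sequence {expected}, got {sequence}")
        self._pending.popleft()
        return batch
```

Durable-ack frames carry no sequence field, so they do not pass through `on_response`.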

OK response

+---------------------------------------------+
| status: uint8       (0x00)                  |
| sequence: int64     Request sequence number |
| tableCount: uint16  Number of table entries |
| Repeated tableCount times:                  |
|   nameLen: uint16   Table name length       |
|   name: bytes       UTF-8 table name        |
|   seqTxn: int64     Sequencer txn for table |
+---------------------------------------------+

The per-table entries report the sequencer transaction assigned to each table that committed data in the acknowledged batch. tableCount is 0 when no WAL (Write-Ahead Log) tables committed (e.g., non-WAL tables or empty batches).

Error response

+-------------------------------------------+
| status: uint8     Status code             |
| sequence: int64   Request sequence number |
| msg_len: uint16   Error message length    |
| msg_bytes: bytes  UTF-8 error message     |
+-------------------------------------------+

Status codes

Code  Hex   Name             Description
----  ----  ---------------  -------------------------------------------------
0     0x00  OK               Batch accepted (written to WAL)
2     0x02  DURABLE_ACK      Batch WAL uploaded to object store (Enterprise)
3     0x03  SCHEMA_MISMATCH  Column type incompatible with existing table
5     0x05  PARSE_ERROR      Malformed message
6     0x06  INTERNAL_ERROR   Server-side error
8     0x08  SECURITY_ERROR   Authorization failure
9     0x09  WRITE_ERROR      Write failure (e.g., table not accepting writes)

Durable acknowledgement

Enterprise

Durable acknowledgement (status code 0x02) is available in QuestDB Enterprise with primary replication configured. Open source QuestDB returns OK (0x00) or error responses only.

A standard OK confirms the batch was committed to the server's local WAL. To receive a second acknowledgement after the WAL has been durably uploaded to the configured object store, include X-QWP-Request-Durable-Ack: true (case-insensitive) in the WebSocket upgrade request.

If the server accepts the opt-in, it echoes X-QWP-Durable-Ack: enabled in the 101 response. Clients that opt in must verify this header is present and fail the connect attempt if it is absent.

Durable-ack response format:

+---------------------------------------------+
| status: uint8       (0x02)                  |
| tableCount: uint16  Number of table entries |
| Repeated tableCount times:                  |
|   nameLen: uint16   Table name length       |
|   name: bytes       UTF-8 table name        |
|   seqTxn: int64     Durably-uploaded seqTxn |
+---------------------------------------------+

The durable-ack has no sequence field. It carries cumulative per-table watermarks that advance as uploads complete. Only tables whose durable watermark advanced since the last durable-ack are included.
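All three server frame types (OK, error, durable-ack) can be decoded by one small function. The sketch below assumes little-endian fields, per the byte ordering rule, and returns plain dictionaries; the function names and the dictionary shape are illustrative only.

```python
import struct

STATUS_OK = 0x00
STATUS_DURABLE_ACK = 0x02


def _read_tables(frame, pos):
    """Read tableCount entries of (nameLen, name, seqTxn) starting at pos."""
    (count,) = struct.unpack_from("<H", frame, pos)
    pos += 2
    tables = {}
    for _ in range(count):
        (name_len,) = struct.unpack_from("<H", frame, pos)
        pos += 2
        name = bytes(frame[pos:pos + name_len]).decode("utf-8")
        pos += name_len
        (seq_txn,) = struct.unpack_from("<q", frame, pos)
        pos += 8
        tables[name] = seq_txn
    return tables


def parse_response(frame):
    """Decode one binary response frame from the server."""
    status = frame[0]
    if status == STATUS_DURABLE_ACK:
        # Durable-ack frames have no sequence field.
        return {"status": status, "tables": _read_tables(frame, 1)}
    (sequence,) = struct.unpack_from("<q", frame, 1)
    if status == STATUS_OK:
        return {"status": status, "sequence": sequence,
                "tables": _read_tables(frame, 9)}
    (msg_len,) = struct.unpack_from("<H", frame, 9)
    message = bytes(frame[11:11 + msg_len]).decode("utf-8")
    return {"status": status, "sequence": sequence, "message": message}
```

The `sequence` value from OK and error frames is what gets matched against the oldest in-flight batch; the durable-ack `tables` map carries cumulative watermarks instead.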

The durable-ack watermark always trails the regular OK watermark. Empty messages (those that produced no WAL commit, for example messages that only reference materialized views) are trivially durable; their sequence advances the durable watermark as soon as all preceding messages are durable.

Reconnects discard any in-flight durable-ack tracking. The new connection re-OKs replayed batches and the server re-emits cumulative durable-ack watermarks from scratch, so the client's trim watermark must restart against the new connection's wire sequencing.

Servers without replication silently ignore the request header and never emit durable-ack frames. There is no durable-failure status; persistent upload failures surface only as absence of a durable-ack frame.

Protocol limits

Limit                          Default value
-----------------------------  -------------
Max batch size                 16 MB
Max tables per connection      10,000
Max rows per table block       1,000,000
Max columns per table          2,048
Max table name length          127 bytes
Max column name length         127 bytes
Max in-flight batches          128
Max symbol dictionary entries  1,000,000

The header's table_count field is a uint16, so the protocol ceiling for tables per message is 65,535 regardless of the configured limit. Individual string values have no dedicated length limit; they are bounded only by the max batch size.

The symbol dictionary limit applies per column in per-table dictionary mode and per connection in global delta dictionary mode. Exceeding it causes the server to reject the message with PARSE_ERROR.

Practical WebSocket frame cap

The 16 MB max batch is a QWP protocol ceiling, not an effective server-side cap. The HTTP receive buffer used by the WebSocket plumbing is typically smaller, and it is checked before the QWP parser ever sees the payload:

Server config key      Default  Effect
---------------------  -------  ------------------------------------------------------------
http.recv.buffer.size  2 MiB    Maximum WebSocket frame the server will accept on /write/v4.

A WebSocket binary frame larger than this is rejected immediately with close code 1009 MESSAGE_TOO_BIG and the connection is dropped — the client will observe an abrupt disconnect (ECANCELED, EPIPE, or similar depending on the WebSocket library) partway through the send.

The effective per-message size limit is therefore min(http.recv.buffer.size, 16 MiB) − WebSocket frame overhead (≤ 14 bytes).

Recommendation for client implementers: keep individual QWP messages comfortably under the server's http.recv.buffer.size — for the default 2 MiB recv buffer, a 1.9 MiB / ~25k-row ceiling per message is a safe target. Operators who want larger batches must raise http.recv.buffer.size on the server (e.g., http.recv.buffer.size=17m to use the full QWP 16 MB headroom).
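The arithmetic behind the effective limit can be sketched as below. The constant and function names are illustrative, and the sketch reads the protocol ceiling as 16 MiB, following the formula above.

```python
MIB = 1024 * 1024
QWP_MAX_BATCH = 16 * MIB        # protocol ceiling, read as 16 MiB
WS_FRAME_OVERHEAD = 14          # worst-case WebSocket frame header size


def effective_message_limit(recv_buffer_size=2 * MIB):
    """Largest QWP message (header + payload) that fits in one accepted frame."""
    return min(recv_buffer_size, QWP_MAX_BATCH) - WS_FRAME_OVERHEAD


def fits_in_frame(message_size, recv_buffer_size=2 * MIB):
    return message_size <= effective_message_limit(recv_buffer_size)
```

A client can call `fits_in_frame` before sealing a batch and flush early when the next row would push the message over the limit.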

Client operation

This section describes the high-level batching and I/O behavior a client implements. The full client-side substrate (on-disk store-and-forward, frame sequence numbers, ACK-driven trim, reconnect/replay semantics) is specified in the connect string reference.

Double-buffered async I/O

The client uses double-buffered microbatches:

  1. The user thread writes rows to the active buffer.
  2. When the active buffer reaches its threshold (row count, byte size, or age), the client seals it and enqueues it for sending.
  3. The client swaps to the other buffer so writing can continue without blocking.
  4. A dedicated I/O thread sends the sealed batches over the WebSocket.

Auto-flush triggers

Trigger               Default
--------------------  ----------
Row count             1,000 rows
Byte size             disabled
Time since first row  100 ms
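The three triggers can be combined into a small policy object, sketched below with the defaults from the table. `AutoFlushPolicy` is a hypothetical name, and the injectable `clock` parameter exists only to make the age trigger testable.

```python
import time


class AutoFlushPolicy:
    """Decides when the active buffer should be sealed and sent."""

    def __init__(self, max_rows=1000, max_bytes=None, max_age_ms=100,
                 clock=time.monotonic):
        self.max_rows = max_rows
        self.max_bytes = max_bytes      # None means the byte-size trigger is off
        self.max_age_ms = max_age_ms
        self._clock = clock
        self.reset()

    def reset(self):
        """Call after the buffer has been sealed."""
        self.rows = 0
        self.bytes = 0
        self._first_row_at = None

    def on_row(self, encoded_size):
        if self.rows == 0:
            self._first_row_at = self._clock()   # age is measured from the first row
        self.rows += 1
        self.bytes += encoded_size

    def should_flush(self):
        if self.rows == 0:
            return False
        if self.rows >= self.max_rows:
            return True
        if self.max_bytes is not None and self.bytes >= self.max_bytes:
            return True
        age_ms = (self._clock() - self._first_row_at) * 1000.0
        return age_ms >= self.max_age_ms
```

The age trigger only fires when something calls `should_flush`, so a real client would also poll it from the I/O thread or a timer rather than relying on new rows arriving.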

Failover and high availability

Ingress senders use a reconnect loop regardless of whether store-and-forward is configured. The two storage modes share identical failover semantics; they differ only in where unacknowledged data lives:

  • sf_dir set (store-and-forward): segments are memory-mapped files under sf_dir. Unacknowledged data survives sender restarts and is replayed by the next sender bound to the same slot.
  • sf_dir unset (memory mode): segments are allocated in process memory. Unacknowledged data is lost if the sender process dies. The reconnect loop still spans transient server outages such as rolling upgrades, but the RAM buffer caps how much data can accumulate during the outage.

Connect-string keys that control ingress failover are documented in the reconnect and failover section of the connect string reference:

Key                               Default  Description
reconnect_max_duration_millis     300000   Total outage budget before giving up.
reconnect_initial_backoff_millis  100      First post-failure sleep.
reconnect_max_backoff_millis      5000     Cap on per-attempt sleep.
initial_connect_retry             off      Retry on first connect (on, sync, async).
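To show how the three timing keys interact, here is a sketch of a sleep schedule built from the defaults above. The growth rule (doubling, no jitter) and the choice to count only sleep time against the budget are assumptions; this page states the first sleep, the cap, and the total budget, but not the curve between them.

```python
def backoff_schedule(initial_ms=100, max_ms=5000, budget_ms=300_000):
    """Yield per-attempt sleeps in milliseconds until the outage budget is spent."""
    elapsed = 0
    delay = initial_ms
    while elapsed < budget_ms:
        sleep = min(delay, budget_ms - elapsed)   # never overshoot the budget
        yield sleep
        elapsed += sleep
        delay = min(delay * 2, max_ms)            # assumed doubling, capped


print(list(backoff_schedule())[:8])
# [100, 200, 400, 800, 1600, 3200, 5000, 5000]
```

A real client would also count connection-attempt time against the budget and stop as soon as a terminal error is seen.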

Key behaviors:

  • Ingress is zone-blind. It pins QWP v1 and never reads SERVER_INFO, so every host's zone tier is equivalent and selection is based on health state only. The zone= connect-string key is accepted but silently ignored, so a connect string shared with egress clients works unchanged on ingress.
  • Authentication errors are terminal at any host (401/403). The reconnect loop does not continue past them.
  • 421 + X-QuestDB-Role is a role reject: transient if the role is PRIMARY_CATCHUP, topology-level otherwise.
  • All other upgrade errors are transient and feed into the reconnect loop, including 404, 426, 503, generic 4xx/5xx, TCP/TLS failures, mid-stream send/recv errors, and an upgrade response that advertises a QWP version outside the client's supported range (per-endpoint, so a host on a rolling upgrade does not lock the client out of compatible peers).

Enterprise

Multi-host failover with automatic reconnect requires QuestDB Enterprise.
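The error-handling rules under "Key behaviors" can be sketched as a small classifier. The `Outcome` enum and the function signature are hypothetical. What a client does with a topology-level reject is not covered on this page, so the sketch only labels it.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    TERMINAL = "terminal"     # authentication error: stop the reconnect loop
    TRANSIENT = "transient"   # feed into the reconnect loop
    TOPOLOGY = "topology"     # role reject other than PRIMARY_CATCHUP


def classify_upgrade_failure(status: Optional[int],
                             role: Optional[str] = None) -> Outcome:
    """status: HTTP status of the failed upgrade, or None for TCP/TLS or
    mid-stream errors. role: value of the X-QuestDB-Role header, if present."""
    if status in (401, 403):
        return Outcome.TERMINAL
    if status == 421 and role is not None:
        if role == "PRIMARY_CATCHUP":
            return Outcome.TRANSIENT
        return Outcome.TOPOLOGY
    return Outcome.TRANSIENT  # 404, 426, 503, other 4xx/5xx, I/O failures


print(classify_upgrade_failure(401))                     # Outcome.TERMINAL
print(classify_upgrade_failure(421, "PRIMARY_CATCHUP"))  # Outcome.TRANSIENT
print(classify_upgrade_failure(503))                     # Outcome.TRANSIENT
```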

Examples

Single table with three columns

Table sensors, 2 rows, 3 columns: id (LONG), value (DOUBLE), ts (TIMESTAMP). No nulls, no Gorilla compression, no delta symbol dictionary.

# Header (12 bytes)
51 57 50 31 # Magic: "QWP1"
01 # Version: 1
00 # Flags: none
01 00 # Table count: 1
XX XX XX XX # Payload length

# Table Block
07 # Table name length: 7
73 65 6E 73 6F 72 73 # "sensors" UTF-8
02 # Row count: 2
03 # Column count: 3

# Schema (full mode)
00 # Schema mode: full
00 # Schema ID: 0

# Column 0: id (LONG)
02 # Name length: 2
69 64 # "id" UTF-8
05 # Type: LONG

# Column 1: value (DOUBLE)
05 # Name length: 5
76 61 6C 75 65 # "value" UTF-8
07 # Type: DOUBLE

# Column 2: ts (TIMESTAMP, designated)
00 # Name length: 0 (designated timestamp)
0A # Type: TIMESTAMP

# Column 0 data (LONG, 2 values)
00 # null_flag: 0x00 (no bitmap)
01 00 00 00 00 00 00 00 # id = 1
02 00 00 00 00 00 00 00 # id = 2

# Column 1 data (DOUBLE, 2 values)
00 # null_flag: 0x00 (no bitmap)
CD CC CC CC CC CC F4 3F # value = 1.3
9A 99 99 99 99 99 01 40 # value = 2.2

# Column 2 data (TIMESTAMP, uncompressed, 2 values)
00 # null_flag: 0x00 (no bitmap)
00 E4 0B 54 02 00 00 00 # ts = 10000000000 microseconds
80 1A 06 00 00 00 00 00 # ts = 400000 microseconds
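As a cross-check, the dump above can be reproduced in a few lines. This sketch assumes the payload length is a little-endian uint32 that counts the bytes after the 12-byte header, and it writes lengths and counts as single bytes, which matches the dump only because every value here is below 128. The helper `name` is illustrative.

```python
import struct

# Type codes as they appear in the dump above.
TYPE_LONG, TYPE_DOUBLE, TYPE_TIMESTAMP = 0x05, 0x07, 0x0A


def name(s: str) -> bytes:
    """Length-prefixed UTF-8 name (single-byte length, valid below 128 only)."""
    raw = s.encode("utf-8")
    return bytes([len(raw)]) + raw


block = b"".join([
    name("sensors"),
    bytes([2, 3]),                                   # row count, column count
    bytes([0x00, 0x00]),                             # schema mode: full, schema ID 0
    name("id") + bytes([TYPE_LONG]),
    name("value") + bytes([TYPE_DOUBLE]),
    name("") + bytes([TYPE_TIMESTAMP]),              # empty name = designated timestamp
    b"\x00" + struct.pack("<2q", 1, 2),              # null_flag + LONG values
    b"\x00" + struct.pack("<2d", 1.3, 2.2),          # null_flag + DOUBLE values
    b"\x00" + struct.pack("<2q", 10_000_000_000, 400_000),  # null_flag + timestamps
])

# Magic, version, flags, table count (LE), payload length (assumed LE uint32).
header = b"QWP1" + bytes([0x01, 0x00]) + struct.pack("<HI", 1, len(block))
message = header + block

print(len(header), len(block))  # 12 76
```

Under those assumptions the `XX XX XX XX` placeholder in the header would be `4C 00 00 00`.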

Nullable VARCHAR column

4 rows where row 1 is null:

# Null flag + bitmap
01 # null_flag: nonzero = bitmap follows
02 # 0b00000010 (bit 1 set = row 1 is null)

# Offset array (3 non-null values = 4 offsets)
00 00 00 00 # offset[0] = 0 (start of "foo")
03 00 00 00 # offset[1] = 3 (end of "foo")
06 00 00 00 # offset[2] = 6 (end of "bar")
09 00 00 00 # offset[3] = 9 (end of "baz")

# String data (concatenated UTF-8)
66 6F 6F # "foo" (row 0)
62 61 72 # "bar" (row 2)
62 61 7A # "baz" (row 3)
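The layout above can be produced by a short encoder. The sketch assumes the bitmap is LSB-first with one bit per row, padded to whole bytes, and that offsets are little-endian uint32 values; both agree with the dump but are not otherwise confirmed in this section. The function name is illustrative.

```python
import struct
from typing import List, Optional


def encode_varchar_column(values: List[Optional[str]]) -> bytes:
    """Encode null flag, optional null bitmap, offset array, and string data."""
    out = bytearray()
    if any(v is None for v in values):
        out.append(0x01)                            # null_flag: bitmap follows
        bitmap = bytearray((len(values) + 7) // 8)
        for row, v in enumerate(values):
            if v is None:
                bitmap[row // 8] |= 1 << (row % 8)  # bit set = row is null
        out += bitmap
    else:
        out.append(0x00)                            # null_flag: no bitmap

    data = bytearray()
    offsets = [0]
    for v in values:
        if v is not None:                           # nulls take no offset slot
            data += v.encode("utf-8")
            offsets.append(len(data))
    out += struct.pack(f"<{len(offsets)}I", *offsets)
    out += data
    return bytes(out)


encoded = encode_varchar_column(["foo", None, "bar", "baz"])
print(encoded[:2].hex(), len(encoded))  # 0102 27
```

Null rows contribute a bitmap bit but no offset entry, which is why four rows with one null produce four offsets rather than five.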

Gorilla timestamps with delta symbol dictionary

Table sensors, 2 rows, 3 columns: host (SYMBOL), temp (DOUBLE), designated TIMESTAMP. Both FLAG_GORILLA and FLAG_DELTA_SYMBOL_DICT are set.

# Header (12 bytes)
51 57 50 31 # Magic: "QWP1"
01 # Version: 1
0C # Flags: 0x04 (Gorilla) | 0x08 (Delta Symbol Dict)
01 00 # Table count: 1
XX XX XX XX # Payload length

# Delta Symbol Dictionary
00 # delta_start = 0
02 # delta_count = 2
07 73 65 72 76 65 72 31 # "server1" (length = 7)
07 73 65 72 76 65 72 32 # "server2" (length = 7)

# Table Block
07 73 65 6E 73 6F 72 73 # Table name "sensors" (length = 7)
02 # row_count = 2
03 # column_count = 3

# Schema (full mode)
00 # schema_mode = FULL
00 # schema_id = 0
04 68 6F 73 74 09 # "host" : SYMBOL
04 74 65 6D 70 07 # "temp" : DOUBLE
00 0A # "" : TIMESTAMP (designated)

# Column 0 (SYMBOL, global delta IDs)
00 # null_flag: no nulls
00 # Row 0: global ID 0
01 # Row 1: global ID 1

# Column 1 (DOUBLE, 2 values)
00 # null_flag: no nulls
66 66 66 66 66 E6 56 40 # 91.6
9A 99 99 99 99 19 57 40 # 92.4

# Column 2 (TIMESTAMP, Gorilla)
00 # null_flag: no nulls
01 # encoding = Gorilla
[8 bytes: first timestamp]
[8 bytes: second timestamp]
# (only 2 values, so no delta-of-delta bit stream follows)
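The delta symbol dictionary section of this example can be generated by tracking which symbols have already been sent on the connection. The sketch assumes varints are unsigned LEB128, which this section does not spell out, and it marks entries as sent at encode time rather than on acknowledgement. `SymbolDictionary` and its methods are hypothetical names.

```python
def varint(n: int) -> bytes:
    """Unsigned LEB128 (assumed varint encoding)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


class SymbolDictionary:
    """Per-connection symbol IDs plus the not-yet-sent tail."""

    def __init__(self):
        self._ids = {}    # symbol -> global ID, in insertion order
        self._sent = 0    # number of entries already written to the wire

    def intern(self, symbol: str) -> int:
        return self._ids.setdefault(symbol, len(self._ids))

    def encode_delta(self) -> bytes:
        new = list(self._ids)[self._sent:]
        out = varint(self._sent) + varint(len(new))   # delta_start, delta_count
        for s in new:
            raw = s.encode("utf-8")
            out += varint(len(raw)) + raw
        self._sent = len(self._ids)
        return out


d = SymbolDictionary()
row_ids = [d.intern("server1"), d.intern("server2")]
print(row_ids)                 # [0, 1]
print(d.encode_delta().hex())  # 00020773657276657231 0773657276657232 without the space
```

A later message that introduces only `server3` would carry delta_start = 2 and delta_count = 1, while rows for `server1` keep referencing ID 0.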

Reference implementation

The reference client implementation is java-questdb-client at commit 67bb5e4.

The server-side protocol parser lives in the QuestDB server repository under core/src/main/java/io/questdb/cutlass/qwp/protocol/.

Version history

Version   Description
1 (0x01)  Initial binary protocol release