High-Speed Crypto Trading: JVM Techniques Behind Bitvavo’s µs Revolution - Oleg Lobanov, Marcos Maia

Channel: Devoxx

Summary

This talk explains how Bitvavo evolved its crypto exchange from a startup architecture into a low-latency, deterministic trading system on the JVM. The speakers outline the original Redis/MySQL/Kafka-based design, the scaling and correctness problems it created, and the requirements for a regulated exchange: stable latency, high availability, horizontal scalability, and deterministic state transitions. The deep dive covers the move to a replicated state machine backed by Raft, with a strict rule set around determinism: no external side effects in the matching path, no wall-clock dependence, no randomness unless seeded, predictable iteration order, and single-threaded processing. It also describes the networking and encoding stack used to reduce latency, including Aeron’s reliable UDP transport and SBE for zero-copy binary message handling. The final part focuses on JVM and system-level optimization techniques such as zero allocation, object pooling, primitive collections, cache-friendly data structures, CPU pinning, and kernel bypass networking. The result presented is sub-millisecond end-to-end latency, around 5 microseconds of internal processing time, and throughput of roughly 100,000 orders per second.

Key Takeaways

  • Bitvavo replaced a startup-style Redis/MySQL/Kafka architecture with a replicated state machine for deterministic order processing.
  • Raft is used to replicate state across nodes while preserving identical outcomes from the same input log.
  • Determinism requirements include no external calls, no wall-clock dependency, no unseeded randomness, and predictable iteration order.
  • Aeron provides reliable transport over UDP, while SBE enables schema-first, zero-copy binary message handling.
  • Low-latency JVM tuning focuses on zero allocation, primitive collections, single-threaded processing, cache locality, and CPU pinning.
  • The architecture supports rolling upgrades by replaying input logs and verifying that new versions produce the same state.
  • Bitvavo reports around 500 microseconds average end-to-end latency, under 1 ms P99, and roughly 100k orders per second.

Sections

Bitvavo’s exchange context and growth

Bitvavo is a crypto exchange headquartered in Amsterdam, operating since 2018. The speakers describe the company’s move from a small startup team to a larger engineering organization and explain how that growth forced changes in the exchange’s architecture. The business is regulated in Europe, with a focus on GDPR, MiCA, and operational constraints that shape the technical design.

Why the original architecture stopped scaling

The early system used a gateway, Redis pub/sub, matching engines, MySQL for storage, Visium for transition tracking, and Kafka for audit logging. It was effective for speed of delivery, but it had major limitations: non-deterministic behavior from parallel processing, shared-resource contention, unpredictable latency, and limited throughput. These issues mattered especially for a trading system where the same input must always produce the same output.

Replicated state machine and Raft

To satisfy correctness and availability requirements, Bitvavo moved to a replicated state machine architecture. Raft is used to replicate commands across nodes so that all machines process the same ordered input log and converge to identical state. The talk emphasizes that this design is lock-free in the hot path, supports high availability through multiple synchronized nodes, and makes debugging easier because a production state can be reproduced locally by replaying the same ingress log.
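The replay property can be illustrated with a minimal sketch. It assumes a toy balance-keeping state machine rather than a matching engine, and every name in it (ReplayDemo, Command, BalanceStateMachine, digest) is hypothetical and not taken from the talk. Raft itself is not modeled; the log is simply assumed to be already agreed upon and ordered. The sketch needs Java 17 or later.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ReplayDemo {
    // Hypothetical command type: the only input a replica ever sees.
    record Command(long sequence, String account, long delta) {}

    // Toy deterministic state machine: balances driven purely by the log.
    // Boxed Longs are used for brevity; a real hot path would avoid them.
    static final class BalanceStateMachine {
        // TreeMap iterates in key order, so the digest below is reproducible.
        private final TreeMap<String, Long> balances = new TreeMap<>();
        private long lastApplied = 0;

        void apply(Command c) {
            // Commands must arrive in log order with no gaps.
            if (c.sequence() != lastApplied + 1) {
                throw new IllegalStateException("gap in log at sequence " + c.sequence());
            }
            balances.merge(c.account(), c.delta(), Long::sum);
            lastApplied = c.sequence();
        }

        // Simple digest of the full state, used to compare replicas.
        long digest() {
            long h = 17;
            for (Map.Entry<String, Long> e : balances.entrySet()) {
                h = 31 * h + e.getKey().hashCode();
                h = 31 * h + Long.hashCode(e.getValue());
            }
            return 31 * h + lastApplied;
        }
    }

    // Builds a fresh replica and feeds it the whole log.
    static long replay(List<Command> log) {
        BalanceStateMachine replica = new BalanceStateMachine();
        for (Command c : log) {
            replica.apply(c);
        }
        return replica.digest();
    }

    public static void main(String[] args) {
        List<Command> log = List.of(
            new Command(1, "alice", 100),
            new Command(2, "bob", 50),
            new Command(3, "alice", -30));
        long production = replay(log);
        long local = replay(log);
        System.out.println("replicas agree: " + (production == local));
    }
}
```

Because the state depends on nothing but the log, replaying the same commands on a laptop yields the same digest as on a production node, which is what makes the local debugging workflow described above possible.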

Determinism rules in the matching path

The core rule is absolute determinism. The speakers call out several constraints: no database or external service calls inside the matching path, no use of wall-clock time, no randomness unless it is seeded and reproducible, and no unordered iteration over hash maps. They also stress that the system is single-threaded to eliminate scheduler-dependent behavior and lock contention.
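A minimal sketch of three of these rules follows: time taken from the command instead of the wall clock, randomness from a seeded generator, and iteration over an insertion-ordered map. The types (DeterminismDemo, PlaceOrder, Engine) and the idea of using the random draw as a tie-breaker are illustrative assumptions, not details from the talk. Java 17 or later is assumed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

final class DeterminismDemo {
    // Hypothetical command: the timestamp is assigned once, before replication,
    // and travels with the command instead of being read from the wall clock.
    record PlaceOrder(long timestampNanos, String orderId) {}

    static final class Engine {
        private final Random random;  // seeded, so a replay reproduces every draw
        // LinkedHashMap iterates in insertion order, unlike HashMap.
        private final Map<String, Long> acceptedAt = new LinkedHashMap<>();
        private long now;             // logical time, advanced only by commands

        Engine(long seed) {
            this.random = new Random(seed);
        }

        long onCommand(PlaceOrder cmd) {
            now = cmd.timestampNanos();   // never System.nanoTime() in this path
            acceptedAt.put(cmd.orderId(), now);
            return random.nextLong();     // e.g. a reproducible tie-breaker
        }

        // Textual view of the state; its order depends only on command order.
        String snapshot() {
            StringBuilder sb = new StringBuilder();
            for (Map.Entry<String, Long> e : acceptedAt.entrySet()) {
                sb.append(e.getKey()).append('@').append(e.getValue()).append(';');
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        Engine a = new Engine(42L);
        Engine b = new Engine(42L);
        PlaceOrder first = new PlaceOrder(1_000L, "order-b");
        PlaceOrder second = new PlaceOrder(2_000L, "order-a");
        long a1 = a.onCommand(first);
        long b1 = b.onCommand(first);
        long a2 = a.onCommand(second);
        long b2 = b.onCommand(second);
        System.out.println("same draws: " + (a1 == b1 && a2 == b2));
        System.out.println("same state: " + a.snapshot().equals(b.snapshot()));
    }
}
```

Two engines built with the same seed and fed the same commands produce the same draws and the same snapshot, regardless of when or where they run.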

Networking and message encoding optimizations

To reduce transport overhead, the system moved away from JSON over TCP. The talk explains why TCP can add latency variance and why UDP alone is not safe for exchange traffic. Bitvavo uses Aeron, which provides a reliable transport model with sequence numbers, NACK-based retransmission, and heartbeats. For message encoding, they chose SBE (Simple Binary Encoding), which uses schema-first layouts and zero-copy access to raw bytes, improving cache locality and reducing garbage creation.
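The zero-copy idea behind SBE can be sketched with a hand-written flyweight over a java.nio.ByteBuffer. This is not SBE-generated code and does not use the SBE or Aeron APIs; the field layout, the class names, and the 21-byte message length are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class OrderFlyweightDemo {
    // Hypothetical fixed layout (not an SBE schema):
    // 0: long orderId, 8: long priceTicks, 16: int quantity, 20: byte side
    static final class OrderFlyweight {
        static final int ORDER_ID_OFFSET = 0;
        static final int PRICE_OFFSET = 8;
        static final int QUANTITY_OFFSET = 16;
        static final int SIDE_OFFSET = 20;
        static final int LENGTH = 21;

        private ByteBuffer buffer;
        private int offset;

        // Points the flyweight at a message; no bytes are copied or parsed.
        OrderFlyweight wrap(ByteBuffer buffer, int offset) {
            this.buffer = buffer;
            this.offset = offset;
            return this;
        }

        // Writers store primitives directly at fixed offsets.
        OrderFlyweight orderId(long v) { buffer.putLong(offset + ORDER_ID_OFFSET, v); return this; }
        OrderFlyweight priceTicks(long v) { buffer.putLong(offset + PRICE_OFFSET, v); return this; }
        OrderFlyweight quantity(int v) { buffer.putInt(offset + QUANTITY_OFFSET, v); return this; }
        OrderFlyweight side(byte v) { buffer.put(offset + SIDE_OFFSET, v); return this; }

        // Readers fetch primitives straight from the underlying bytes.
        long orderId() { return buffer.getLong(offset + ORDER_ID_OFFSET); }
        long priceTicks() { return buffer.getLong(offset + PRICE_OFFSET); }
        int quantity() { return buffer.getInt(offset + QUANTITY_OFFSET); }
        byte side() { return buffer.get(offset + SIDE_OFFSET); }
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer
            .allocateDirect(2 * OrderFlyweight.LENGTH)
            .order(ByteOrder.LITTLE_ENDIAN);

        // One flyweight instance is reused for every message in the buffer.
        OrderFlyweight order = new OrderFlyweight();
        order.wrap(buffer, 0).orderId(1L).priceTicks(6_500_000L).quantity(3).side((byte) 'B');
        order.wrap(buffer, OrderFlyweight.LENGTH).orderId(2L).priceTicks(6_499_000L).quantity(5).side((byte) 'S');

        order.wrap(buffer, 0);
        System.out.println(order.orderId() + " " + order.priceTicks() + " " + order.quantity());
    }
}
```

Compared with parsing JSON, reading a field here is a single load from a known offset, and no intermediate objects or strings are created per message.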

JVM and hardware-level low-latency practices

The speakers describe a set of JVM optimization practices used in the exchange: zero allocation on the hot path, object pools, ring buffers, direct buffers, flyweights, primitive collections, and avoiding boxing and strings. On the concurrency side, they rely on the single-writer principle to avoid locks, and they keep independently written data apart to avoid false sharing. They also pin critical threads to CPU cores, isolate those cores from other work, and use kernel bypass networking to avoid unnecessary kernel overhead.
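As one illustration of zero allocation on the hot path, here is a minimal object pool sketch. It assumes a single-threaded caller, consistent with the single-writer approach described above, and the names (PoolDemo, Order, OrderPool) and fields are hypothetical rather than taken from the talk.

```java
final class PoolDemo {
    // Mutable, reusable order object with primitive fields only.
    static final class Order {
        long id;
        long priceTicks;
        int quantity;

        void reset() {
            id = 0;
            priceTicks = 0;
            quantity = 0;
        }
    }

    // Fixed-size pool: every Order is allocated once, at startup.
    // Not thread-safe by design; it is meant for a single-threaded hot path.
    static final class OrderPool {
        private final Order[] free;
        private int top;  // number of objects currently available

        OrderPool(int capacity) {
            free = new Order[capacity];
            for (int i = 0; i < capacity; i++) {
                free[i] = new Order();
            }
            top = capacity;
        }

        Order acquire() {
            if (top == 0) {
                throw new IllegalStateException("pool exhausted");
            }
            return free[--top];
        }

        void release(Order order) {
            if (top == free.length) {
                throw new IllegalStateException("released more objects than acquired");
            }
            order.reset();  // clear state so stale data never leaks into reuse
            free[top++] = order;
        }

        int available() {
            return top;
        }
    }

    public static void main(String[] args) {
        OrderPool pool = new OrderPool(1024);
        Order order = pool.acquire();   // no allocation here
        order.id = 42;
        order.priceTicks = 6_500_000L;
        order.quantity = 3;
        pool.release(order);            // returned for reuse instead of becoming garbage
        System.out.println("available: " + pool.available());
    }
}
```

After startup the pool performs no allocation, so the garbage collector has nothing to do on this path; the trade-off is that callers must release objects explicitly and must not use them afterwards.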

Deployment, upgrades, and feature rollout

Deploying a deterministic distributed system requires careful versioning. The talk covers two approaches: stop-the-world upgrades, which are safest but involve downtime, and rolling upgrades, which avoid downtime but require strong verification. Bitvavo validates new versions by replaying production logs in CI and comparing the resulting state. Feature flags must also be part of the command log rather than external configuration, so every node transitions at the same logical point.
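The point about feature flags living in the command log can be sketched as follows. The fee calculation, the flag name, and all types here are invented for illustration and do not come from the talk; the sketch only shows that a flag flipped by a logged command takes effect at the same logical position on every replay. Java 17 or later is assumed.

```java
import java.util.List;

final class RolloutDemo {
    // Hypothetical commands: business input and flag changes share one log.
    interface Command {}
    record Trade(long notional) implements Command {}
    record EnableFlag(String name) implements Command {}

    static final class FeeEngine {
        private boolean newFeeSchedule;  // flipped only by a logged command
        private long totalFees;

        void apply(Command c) {
            if (c instanceof EnableFlag flag) {
                if (flag.name().equals("new-fee-schedule")) {
                    newFeeSchedule = true;
                }
            } else if (c instanceof Trade trade) {
                // Made-up fee rules: notional/400 before the flag, notional/500 after.
                long divisor = newFeeSchedule ? 500 : 400;
                totalFees += trade.notional() / divisor;
            }
        }

        long totalFees() {
            return totalFees;
        }
    }

    // Replays a log on a fresh engine, as a CI verification job might.
    static long replay(List<Command> log) {
        FeeEngine engine = new FeeEngine();
        for (Command c : log) {
            engine.apply(c);
        }
        return engine.totalFees();
    }

    public static void main(String[] args) {
        List<Command> log = List.of(
            new Trade(10_000),
            new EnableFlag("new-fee-schedule"),
            new Trade(10_000));
        // Every replica switches schedules between the two trades.
        System.out.println("total fees: " + replay(log));
        System.out.println("replays agree: " + (replay(log) == replay(log)));
    }
}
```

If the flag were read from external configuration instead, two nodes could observe the change at different points in the log and compute different totals for the same input.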

Performance results and operational implications

The presented results show less than 1 millisecond end-to-end P99 latency, about 500 microseconds average latency, around 5 microseconds of internal processing time, and approximately 100,000 orders per second throughput. The speakers emphasize that these numbers are not treated as a hard architectural ceiling, but as the current state of a system designed to scale further while preserving predictability, correctness, and auditability.

Keywords: bitvavo crypto exchange, jvm low latency trading, replicated state machine, raft consensus, aeron reliable udp, simple binary encoding sbe, deterministic order matching, high frequency trading java, single-threaded matching engine, zero allocation jvm, object pooling, primitive collections, cache locality, false sharing, cpu pinning, kernel bypass networking, microsecond latency, order book matching engine, crypto exchange architecture, rolling upgrades distributed systems, financial audit log, regulatory compliance mica, redis pub/sub to raft migration, json vs binary protocol