Canton node performance depends on database throughput, JVM configuration, sequencer capacity, and pruning strategy. This page covers the key tuning areas for validator and SV node operators.
Database optimization
PostgreSQL is the primary performance bottleneck in most Canton deployments.
Connection pools
Canton uses HikariCP for database connection pooling. The default pool size works for light workloads, but high-throughput deployments benefit from tuning. Set maxConnections based on your PostgreSQL max_connections setting and the number of Canton processes sharing the database server. A good starting point is max_connections / number_of_canton_processes, leaving headroom for monitoring and maintenance connections.
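The starting-point formula can be sketched as a small calculation. This is only an illustration: the function name and the `reserved` headroom of 10 connections are assumptions, not Canton or HikariCP defaults.

```python
def pool_size_per_process(pg_max_connections: int,
                          canton_processes: int,
                          reserved: int = 10) -> int:
    """Split PostgreSQL's max_connections evenly across Canton processes.

    `reserved` is a hypothetical headroom for monitoring and maintenance
    connections; the value 10 is an assumption chosen for illustration.
    """
    if canton_processes < 1:
        raise ValueError("need at least one Canton process")
    usable = pg_max_connections - reserved
    if usable < canton_processes:
        raise ValueError("max_connections is too small for the requested headroom")
    # Integer division: every process gets the same, rounded-down share.
    return usable // canton_processes


if __name__ == "__main__":
    # Example: max_connections = 200 shared by 3 Canton processes.
    print(pool_size_per_process(200, 3))
```

Treat the result as an upper bound for each process's pool and validate it under load.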
PostgreSQL tuning
These PostgreSQL parameters, set in postgresql.conf, have the most impact on Canton workloads:
- shared_buffers: Set to 25% of available RAM. For a 64 GB database server, use 16GB.
- effective_cache_size: Set to 50-75% of available RAM. This tells the query planner how much memory is available for caching, including OS cache.
- work_mem: Controls memory for sort operations and hash tables. Start with 64MB and increase if you see disk-based sorts in query plans.
- maintenance_work_mem: Memory for VACUUM and index operations. Set to 1GB or higher for large databases.
- max_wal_size: Controls checkpoint frequency. Increase to 4GB or 8GB to reduce checkpoint pressure under heavy write loads.
- random_page_cost: Set to 1.1 if your database runs on SSD storage (default is 4.0, which is tuned for spinning disks).
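As a rough illustration, the sizing rules above can be turned into postgresql.conf lines for a given amount of RAM. The helper names are hypothetical. Where the text gives a range, the sketch assumes one value: 75% for effective_cache_size and 4GB for max_wal_size.

```python
def suggest_postgres_settings(ram_gb: int, ssd: bool = True) -> dict[str, str]:
    """Derive starting values from the rules of thumb in this section."""
    if ram_gb < 4:
        raise ValueError("this sketch assumes at least 4 GB of RAM")
    settings = {
        "shared_buffers": f"{ram_gb // 4}GB",            # 25% of RAM
        "effective_cache_size": f"{ram_gb * 3 // 4}GB",  # assumption: top of the 50-75% range
        "work_mem": "64MB",                              # starting value from the text
        "maintenance_work_mem": "1GB",
        "max_wal_size": "4GB",                           # assumption: lower of the two suggested values
    }
    if ssd:
        settings["random_page_cost"] = "1.1"
    return settings


def render_conf(settings: dict[str, str]) -> str:
    """Format the settings as postgresql.conf lines."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items())


if __name__ == "__main__":
    print(render_conf(suggest_postgres_settings(64)))
```

These are starting points only; confirm them with load testing before applying them in production.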
Indexing
Canton creates the necessary indexes during schema migration. Do not modify or drop Canton-managed indexes. If you observe slow queries in your PostgreSQL logs, check that autovacuum is running properly: bloated tables and stale statistics are the most common cause of query plan degradation.
JVM tuning
Canton runs on the JVM. The default JVM settings are conservative, and production deployments benefit from explicit configuration.
Heap size
Set the heap size based on the node type and expected workload:
- Validator (participant): Start with -Xmx4g. Increase to 8-12 GB for high-throughput workloads or when hosting many parties.
- Sequencer: Start with -Xmx4g. The sequencer's memory needs scale with message throughput.
- Mediator: Start with -Xmx2g. The mediator has lighter memory requirements than the sequencer or participant.
Garbage collection
G1GC is the recommended garbage collector for Canton. It provides good throughput with predictable pause times. Key settings:
- -XX:+UseG1GC: Enable the G1 garbage collector.
- -XX:MaxGCPauseMillis=200: Target maximum GC pause. Lower values reduce latency spikes but may reduce throughput.
- -XX:G1HeapRegionSize=16m: For heaps above 8 GB, increasing the region size improves G1 efficiency.
Enable GC logging with -Xlog:gc*:file=/var/log/canton/gc.log:time,uptime,level,tags and watch for frequent full GC pauses, which indicate the heap is too small.
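One way to watch for full GC pauses is to scan the GC log for them. The sketch below assumes that completed full collections appear as lines containing "Pause Full" and ending in a duration in milliseconds, which is how JDK unified logging usually prints them. Check this against your own log before relying on it. The function name and the sample lines are illustrative.

```python
import re
from typing import Iterable

# Assumption: a completed full collection is logged on one line that contains
# "Pause Full" and ends with its duration, e.g. "... 812.345ms".
FULL_GC = re.compile(r"Pause Full.*?(\d+(?:\.\d+)?)ms\s*$")


def full_gc_pauses(lines: Iterable[str]) -> list[float]:
    """Return the duration in milliseconds of every full GC pause found."""
    pauses = []
    for line in lines:
        match = FULL_GC.search(line)
        if match:
            pauses.append(float(match.group(1)))
    return pauses


if __name__ == "__main__":
    # Illustrative lines only; real logs carry the decorations you configured.
    sample = [
        "[12.345s][info][gc,start] GC(5) Pause Full (G1 Compaction Pause)",
        "[13.157s][info][gc] GC(5) Pause Full (G1 Compaction Pause) 4090M->3900M(4096M) 812.345ms",
        "[14.002s][info][gc] GC(6) Pause Young (Normal) (G1 Evacuation Pause) 512M->128M(4096M) 12.001ms",
    ]
    print(full_gc_pauses(sample))
```

To scan a real log, pass an open file object, for example `full_gc_pauses(open("/var/log/canton/gc.log"))`.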
Sequencer throughput
The sequencer's throughput is determined by its ordering backend and database performance. For the centralized (PostgreSQL) ordering backend (currently in Alpha):
- The single database is the serialization point for all message ordering
- Vertical scaling of the database server (faster CPU, more IOPS) directly improves throughput
- Network latency between the sequencer process and its database affects every message
Traffic management
On the Global Synchronizer, every transaction consumes traffic, which is paid for with Canton Coin. To reduce traffic costs:
- Batch operations: Submit multiple related commands together rather than individually. Canton processes batched commands more efficiently.
- Contract design: Smaller contracts and fewer contract creates/archives per transaction reduce traffic consumption.
- Synchronizer assignment: Move high-frequency bilateral workflows to a private synchronizer where no traffic fees apply.
Pruning
Canton stores a full history of transactions and ACS (Active Contract Set) snapshots. Over time, this data accumulates and can slow down queries. Pruning removes historical data that is no longer needed.
Impact on performance
- Pruning reduces database size, which improves backup times and query performance
- The pruning process itself is resource-intensive, so schedule it during low-traffic periods
- After pruning, run VACUUM ANALYZE on the affected tables to reclaim disk space and update query statistics
Configuration
Batch sizes
Canton processes commands in batches internally. The default batch sizes balance latency and throughput. For high-throughput workloads, you can increase the batch sizes.
Monitoring performance
Track these metrics to identify bottlenecks: canton_participant_command_completion_latency (end-to-end command time), canton_sequencer_send_latency (sequencer throughput), canton_participant_db_query_latency (database health), hikaricp_connections_active (connection pool saturation), and JVM heap/GC metrics. Set up dashboards and alert when values exceed your baseline. Performance degradation is usually gradual, and early detection prevents outages.
High Availability Usage
This section looks at some of the components already mentioned and supplies useful Canton commands.
Participant
High availability of a participant node is achieved by running multiple participant node replicas that have access to a shared database. Participant node replicas are configured in the Canton configuration file as individual participants with two required changes for each participant node replica:
- Use the same storage configuration to ensure access to the shared database. Only PostgreSQL and Oracle-based storage is supported for HA. For Oracle it is crucial that the participant replicas use the same username to access the shared database.
- Set replication.enabled = true for each participant node replica.
Starting from Canton 2.4.0, participant replication is enabled by default when using supported storage.
Manual trigger of a fail-over
Fail-over from the active to a passive replica happens automatically when the active replica fails, but you can also initiate a graceful fail-over manually.
Load balancer configuration
Multiple replicated participants can be placed behind a load balancer that uses health checks to determine which participant instance is active and directs Ledger API and Admin API requests to that instance. This makes participant replication and fail-over transparent to the Ledger API application or Canton console administering the logical participant, as they are simply pointed at the load balancer. Participants should be configured to expose an “IsActive” health status on the health HTTP server through the monitoring configuration. The health server then responds successfully to a GET request on /health if the participant is currently the active replica. Otherwise, an error is returned.
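The same health endpoint can be probed by hand to find the active replica. This is a minimal sketch, assuming each replica serves /health over plain HTTP/1. The function names, hostnames, and port 8080 are illustrative assumptions and should be replaced with your own values.

```python
import urllib.request
from typing import Callable, Iterable, Optional


def is_active(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on the health URL succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # urllib's URLError/HTTPError and socket timeouts are all OSError
        # subclasses: treat any of them as "not the active replica".
        return False


def find_active(urls: Iterable[str],
                probe: Callable[[str], bool] = is_active) -> Optional[str]:
    """Return the first health URL whose probe succeeds, or None."""
    for url in urls:
        if probe(url):
            return url
    return None


if __name__ == "__main__":
    # Hypothetical replica addresses; replace with your own.
    replicas = [
        "http://participant1.lan:8080/health",
        "http://participant2.lan:8080/health",
    ]
    print(find_active(replicas))
```

The `probe` parameter exists so the lookup can be tested without a network, for example by passing a lambda.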
A load balancer is suitable only if it supports HTTP/1 health checks for routing requests to a separate HTTP/2 (gRPC) server. This is possible with HAProxy using the following example configuration:
    global
        log stdout format raw local0

    defaults
        log global
        mode http
        option httplog
        # enabled so long running connections are logged immediately upon connect
        option logasap

    # expose the admin-api and ledger-api as separate servers
    frontend admin-api
        bind :15001 proto h2
        default_backend admin-api

    backend admin-api
        # enable HTTP health checks
        option httpchk
        # required to create a separate connection to query the load balancer.
        # this is particularly important as the health HTTP server does not support h2
        # which would otherwise be the default.
        http-check connect
        # set the health check uri
        http-check send meth GET uri /health
        # list all participant backends
        server participant1 participant1.lan:15001 proto h2 check port 8080
        server participant2 participant2.lan:15001 proto h2 check port 8080
        server participant3 participant3.lan:15001 proto h2 check port 8080

    # repeat a similar configuration to the above for the ledger-api
    frontend ledger-api
        bind :15000 proto h2
        default_backend ledger-api

    backend ledger-api
        option httpchk
        http-check connect
        http-check send meth GET uri /health
        server participant1 participant1.lan:15000 proto h2 check port 8080
        server participant2 participant2.lan:15000 proto h2 check port 8080
        server participant3 participant3.lan:15000 proto h2 check port 8080
Optimize Storage
General Settings
Max Connection Settings
The storage configuration can be tuned further with additional maximum-connection settings. In what follows, X is the upper bound of permanent storage connections for a node, Y is the size of the participant's Indexer pool, and Z is the number of connections used by the exclusive sequencer writer component, which is the final parameter that can be controlled.
A participant node establishes up to X + Y + 2 permanent connections with the database, whereas a synchronizer uses up to X permanent connections, except for a sequencer with an HA setup, which allocates up to 2X connections. During startup, a node uses an additional set of at most X temporary connections for database initialisation.
The number X represents an upper bound of permanent connections and is divided internally for different purposes, depending on the implementation. Consequently, the actual size of the write connection pool, for example, could be smaller. Some of the allotted connections are taken by the Read pool, some by the Write pool, and a single additional connection is reserved for a dedicated Main connection responsible for managing the locking mechanism.
The following table summarizes the detailed split of the connection pools in different Canton nodes. R signifies a Read pool, W a Write pool, A a Ledger API pool, I an Indexer pool, RW a combined Read/Write pool, and M the Main pool.
| Node Type | Enterprise Edition with Replication | Enterprise Edition | Community Edition |
|---|---|---|---|
| Participant | A = X / 2, R = X / 4, W = X / 4 - 1, M = 1, I = Y | A = X / 2, R = X / 4, W = X / 4 - 1, M = 1, I = Y | A = X / 2, RW = X / 2, I = Y |
| Mediator | R = X / 2, W = X / 2 - 1, M = 1 | N/A | N/A |
| Sequencer | RW = X | N/A | N/A |
| Sequencer writer | R = X / 2, W = X / 2 - 1, M = 1 | N/A | N/A |
| Sequencer exclusive writer | R = Z / 2, W = Z / 2 | N/A | N/A |
| Synchronizer | N/A | RW = X | RW = X |
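The pool sizes derived from these formulas can be overridden with explicit configuration settings for the Ledger API (A), Read (R), and Write (W) pools.
Where a node uses a combined Read/Write pool, the R and W overrides are added together to determine the overall pool size.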
The effective connection pool sizes are reported by the Canton nodes at startup.
INFO c.d.c.r.DbStorageMulti$:participant=participant_b - Creating storage, num-reads: 5, num-writes: 4
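The participant row of the table can be worked through as a short calculation. This is a sketch under assumptions: the function names are hypothetical, divisions are assumed to round down, and values of X below 8 are rejected because the Write pool formula would otherwise reach zero.

```python
def participant_pools(x: int, y: int) -> dict[str, int]:
    """Pool split for an Enterprise participant, following the table:
    A = X / 2, R = X / 4, W = X / 4 - 1, M = 1, I = Y.

    Assumption: divisions round down (integer division).
    """
    if x < 8:
        raise ValueError("this sketch only covers X >= 8")
    return {
        "A": x // 2,      # Ledger API pool
        "R": x // 4,      # Read pool
        "W": x // 4 - 1,  # Write pool
        "M": 1,           # Main connection managing the locking mechanism
        "I": y,           # Indexer pool
    }


def max_permanent_connections(x: int, y: int) -> int:
    """Upper bound of permanent connections stated for a participant."""
    return x + y + 2


if __name__ == "__main__":
    print(participant_pools(20, 4))
    print(max_permanent_connections(20, 4))
```

With X = 20 the formulas give R = 5 and W = 4, the same read and write counts as in the log line above.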
Queue Size
Canton may schedule more database queries than the database can handle. These queries are then placed into the database queue, which by default holds 1000 queries. Reaching the queueing limit leads to a DB_STORAGE_DEGRADATION warning. When this happens, queuing overflows into the asynchronous execution context and slowly degrades processing, which results in fewer database queries being created. In high-performance setups, such spikes might occur more regularly. To avoid the degradation warning appearing too frequently, the queue size can be configured to a larger value.
Postgres
Postgres Configuration
For Postgres, the PGTune online tool is a good starting point for finding reasonable parameters (use online transaction processing system), but you need to increase the settings of shared_buffers, checkpoint_timeout and max_wal_size, as explained below.
Beyond the initial configuration, note that most indexes Canton uses are “hash based”. Therefore, read and write access to these indexes is uniformly distributed. However, Postgres reads and writes indexes in pages of 8 kB, while a single index update might change only a few bytes. It is therefore very important to keep the indexes in memory and write updates to disk only from time to time; otherwise, a simple change of 32 bytes requires 8 kB of I/O.
Configuring the shared_buffers setting to hold 60-70% of the host memory is recommended, rather than the default suggestion of 25%, as the Postgres caching appears to be more effective than the host-based file access caching.
Also increase the following settings beyond their defaults. Increase checkpoint_timeout so that each flush to disk covers several writes per page, accumulated over time, rather than just one. Combine this with a higher max_wal_size so that the system does not flush prematurely before reaching checkpoint_timeout. Monitor your system during load testing and tune the parameters according to your use case. The downside of changing the checkpointing parameters is that crash recovery takes longer.
Sizing and Performance
Your Postgres database setup requires appropriate tuning to achieve the desired performance, because Canton is database-heavy. This section gives you a starting point for your tuning efforts; consult the troubleshooting section on how to analyze whether the database is a limiting factor. Ultimately, every use case is different, and the exact resource requirements cannot be predicted but have to be measured. First, ensure that the database you are using is appropriately sized for your use case. The number of cores depends on your throughput requirements. The rule of thumb is:
- 1 db core per 1 participant core.
- 1 participant core for 30-100 ledger events per second (depends on the complexity of the commands).
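The rule of thumb can be applied as a quick estimate. In this sketch the function name is hypothetical and the default of 30 events per second per core is the conservative end of the stated range. The result is a starting point to be confirmed by measurement.

```python
import math


def cores_for_throughput(events_per_second: float,
                         events_per_core: float = 30.0) -> dict[str, int]:
    """Estimate cores from the rule of thumb: 30-100 ledger events per second
    per participant core, and one database core per participant core."""
    if events_per_second <= 0 or events_per_core <= 0:
        raise ValueError("rates must be positive")
    participant_cores = math.ceil(events_per_second / events_per_core)
    return {"participant_cores": participant_cores, "db_cores": participant_cores}


if __name__ == "__main__":
    # Conservative and optimistic estimates for 600 ledger events per second.
    print(cores_for_throughput(600))         # assumes 30 events/s per core
    print(cores_for_throughput(600, 100.0))  # assumes 100 events/s per core
```

Running both ends of the range brackets the hardware to budget for before load testing narrows it down.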