Performance


Fluxtion can generate ahead of time high performance event processors that are suitable for the most demanding low latency environments. This section documents the performance results for a test project.

Jmh is used to measure throughput per second and latency for a single calculation cycle.

HdrHistogram is used to record latency percentiles for an execution run

Results are in the nanosecond range as Fluxtion operates with sub-microsecond response times

Summary results

  • 33 million events processed per second
  • Average latency is 30 nanoseconds to process one event including app logic
  • The Fluxtion event processor is in the low nanosecond range for event processing overhead
  • Zero gc
  • Single threaded application

Test setup

The test project process a market update and carries out a set of calculation for each market data price ladder event:

public class PriceLadder {
    private final int[] bidSizes = new int[5];
    private final int[] bidPrices = new int[5];
    private final int[] askSizes = new int[5];
    private final int[] askPrices = new int[5];
}
  • A graph with four nodes each performing some calculations
  • A set of 10,000 randomly generated data events is passed into the processor for processing
  • Single threaded
  • No core pinning or isolation
  • AOT generated version of an event processor is under test

The goal is to create a representative test with randomly distributed input data set to remove bias in the results.

Node description

Eval order Class Description
Node 1 MidCalculator mid price calculator
Node 2 SkewCalculator Skew calculation, move all prices at each level by the skew amount
Node 3 LevelsCalculator Remove levels by setting to zero for price and volume if max levels < input ladder level count
Node 4 PriceLadderPublisher Publish the calculated PriceLadder to a consumer for distribution

Jmh results

Benchmark                                               Mode  Cnt         Score   Error  Units
PriceLadderBenchmark.throughPut_BranchingProcessor     thrpt    2  33315147.680          ops/s
PriceLadderBenchmark.throughPut_No_BranchingProcessor  thrpt    2  34325482.879          ops/s
PriceLadderBenchmark.avgTime_BranchingProcessor         avgt    2        29.309          ns/op
PriceLadderBenchmark.avgTime_No_BranchingProcessor      avgt    2        30.614          ns/op

The average time to process an event is 30 nanoseconds, this includes the application logic invoked for each node.

Latency distribution

At 99.99% the latency is measured at 0.625 microseconds, which includes the underlying machine jitter. The results are recorded using HdrHistogram which adds an overhead of several nanoseconds to each reading. The tail of the app latency follows the tail for jitter on the test machine with no app code executing, this tells us the jitter at the higher end is due to the machine and not the application.

  • red line - total latency
  • blue line - machine jitter with no processing

Table of contents