Introduction
Recently I published a new library on GitHub: json-path-mapper. In short, it is a library for mapping selected JSON fields to a Java data object. The main API has two map methods: the first maps the fields sequentially, the second maps them in parallel. So it is important to check when the first method works better and when the second does.
Using JMH
JMH (Java Microbenchmark Harness) is a benchmark library for JVM applications. With it we can, for example, measure the execution time of our code. JMH is quite easy to use, and in a Gradle project it's even easier. The only thing we need to add is jmh-gradle-plugin. The plugin's repository page has complete installation and configuration instructions, so I won't repeat them here. You can also see an example of a complete configuration of this plugin in my library.
Making benchmark tests in Java
As I mentioned above, I want to compare the performance of two versions of the mapping algorithm: sequential and parallel. The results may depend on the number of mapped fields and on how expensive field validation and the field mapper are, so I have to test various combinations.
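Why the per-field cost matters can be sketched without the library. The example below is only an illustration under stated assumptions: it uses plain Java parallel streams rather than json-path-mapper (whose internal implementation may differ), and the class and method names are hypothetical. It emulates a mapper with a fixed cost per field and sums the mapped values both ways.

```java
import java.util.function.IntUnaryOperator;
import java.util.stream.IntStream;

// Hypothetical illustration, not part of json-path-mapper.
final class SequentialVsParallelSketch {

    /** Emulates a field mapper that costs roughly costMs milliseconds per call. */
    static IntUnaryOperator mapperWithCost(long costMs) {
        return value -> {
            try {
                Thread.sleep(costMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return value;
        };
    }

    /** Maps the values 1..fields one after another and sums them. */
    static int sumSequential(int fields, IntUnaryOperator mapper) {
        return IntStream.rangeClosed(1, fields).map(mapper).sum();
    }

    /** Same work, but spread over the common fork-join pool. */
    static int sumParallel(int fields, IntUnaryOperator mapper) {
        return IntStream.rangeClosed(1, fields).parallel().map(mapper).sum();
    }

    public static void main(String[] args) {
        for (long costMs : new long[] {0, 1}) {
            IntUnaryOperator mapper = mapperWithCost(costMs);
            long t0 = System.nanoTime();
            int sequential = sumSequential(100, mapper);
            long t1 = System.nanoTime();
            int parallel = sumParallel(100, mapper);
            long t2 = System.nanoTime();
            // One-shot timings like these are noisy; they only hint at the trend.
            System.out.printf("cost=%d ms: sequential %d us, parallel %d us (sums %d / %d)%n",
                    costMs, (t1 - t0) / 1_000, (t2 - t1) / 1_000, sequential, parallel);
        }
    }
}
```

A single timed run like this is affected by JIT warm-up and thread start-up, which is exactly why the real measurements are done with JMH.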
The complete benchmark test:
package pl.dmarciniak.jsonpathmapper.benchmark;

import org.openjdk.jmh.annotations.*;
import pl.dmarciniak.jsonpathmapper.FieldMapper;
import pl.dmarciniak.jsonpathmapper.JsonPathMapper;
import pl.dmarciniak.jsonpathmapper.JsonPathMapperBuilder;
import pl.dmarciniak.jsonpathmapper.benchmark.data.helper.JmhResourceLoader;

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

import static org.assertj.core.api.Assertions.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Fork(1)
public class MappingManyValuesBenchmark {

    // Number of JSON fields mapped in a single map call.
    @Param({"5", "10", "50", "100", "500", "1000"})
    private int valuesAmount;

    // Emulated cost of mapping one field, in milliseconds.
    @Param({"0", "1"})
    private int fieldMappingTimeMs;

    private String json;
    private JsonPathMapper<Integer> mapper;
    private Function<Integer, Integer> fieldMapperEmulator;

    @Setup
    public void before() {
        json = JmhResourceLoader.load("json/big.json");

        // Field mapper that sleeps for fieldMappingTimeMs and returns its input unchanged.
        fieldMapperEmulator = (i) -> {
            try {
                Thread.sleep(fieldMappingTimeMs);
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
            return i;
        };

        // Map the fields $.test.i1 ... $.test.i<valuesAmount> and add each value to an AtomicInteger.
        JsonPathMapperBuilder<AtomicInteger> builder =
                JsonPathMapper.forClass(AtomicInteger.class).initialize(AtomicInteger::new);
        for (int i = 1; i <= valuesAmount; ++i) {
            builder.mapField(FieldMapper.fromPath("$.test.i" + i, Integer.class)
                    .toGetterField(AtomicInteger::addAndGet)
                    .withMapper(fieldMapperEmulator));
        }
        mapper = builder.buildWithResultMapper(AtomicInteger::get);

        // Sanity check outside the measured code: both variants must return the expected sum.
        assertThat(mapper.map(json)).isEqualTo(((1 + valuesAmount) * valuesAmount) / 2);
        assertThat(mapper.parallelMap(json)).isEqualTo(((1 + valuesAmount) * valuesAmount) / 2);
    }

    @Benchmark
    public Integer sequentialMap() {
        return mapper.map(json);
    }

    @Benchmark
    public Integer parallelMap() {
        return mapper.parallelMap(json);
    }
}
As you can see, writing benchmark tests is quite easy. Like unit test frameworks, JMH provides annotations to configure benchmark tests. I will describe some of them.
The @Param annotation injects values into fields, and JMH runs each benchmark for every combination of the listed values. In my example I want to check various numbers of map operations (valuesAmount) and different costs of the field mapper (fieldMappingTimeMs).
A method annotated with @Setup runs before the measured code. By default (Level.Trial) it runs once per trial, that is once for each benchmark method and parameter combination in a fork. It's a good place to initialize values. I also check here that both mapping algorithms return correct results, because I don't want to do that inside the measured benchmark methods.
Benchmark methods are annotated with @Benchmark. It's good practice for these methods to return the computed result: JMH consumes the returned value, which prevents the JIT compiler from eliminating the computation as dead code.
To run the benchmark tests, we only need to run this command:
./gradlew clean jmh
This command runs 24 benchmarks (6 values of valuesAmount × 2 values of fieldMappingTimeMs × 2 benchmark methods). The whole run takes about 40 minutes. The result looks something like this:
Benchmark                                 (fieldMappingTimeMs)  (valuesAmount)  Mode  Cnt     Score     Error  Units
MappingManyValuesBenchmark.parallelMap                       0               5  avgt    5     0.146 ±   0.014  ms/op
MappingManyValuesBenchmark.parallelMap                       0              10  avgt    5     0.148 ±   0.011  ms/op
MappingManyValuesBenchmark.parallelMap                       0              50  avgt    5     0.223 ±   0.001  ms/op
MappingManyValuesBenchmark.parallelMap                       0             100  avgt    5     0.391 ±   0.006  ms/op
MappingManyValuesBenchmark.parallelMap                       0             500  avgt    5     2.789 ±   0.016  ms/op
MappingManyValuesBenchmark.parallelMap                       0            1000  avgt    5    10.261 ±   0.042  ms/op
MappingManyValuesBenchmark.parallelMap                       1               5  avgt    5     1.791 ±   0.162  ms/op
MappingManyValuesBenchmark.parallelMap                       1              10  avgt    5     2.895 ±   0.019  ms/op
MappingManyValuesBenchmark.parallelMap                       1              50  avgt    5     8.680 ±   0.083  ms/op
MappingManyValuesBenchmark.parallelMap                       1             100  avgt    5    17.564 ±   0.137  ms/op
MappingManyValuesBenchmark.parallelMap                       1             500  avgt    5    88.303 ±   2.604  ms/op
MappingManyValuesBenchmark.parallelMap                       1            1000  avgt    5   179.305 ±   4.142  ms/op
MappingManyValuesBenchmark.sequentialMap                     0               5  avgt    5     0.119 ±   0.002  ms/op
MappingManyValuesBenchmark.sequentialMap                     0              10  avgt    5     0.123 ±   0.001  ms/op
MappingManyValuesBenchmark.sequentialMap                     0              50  avgt    5     0.171 ±   0.001  ms/op
MappingManyValuesBenchmark.sequentialMap                     0             100  avgt    5     0.256 ±   0.002  ms/op
MappingManyValuesBenchmark.sequentialMap                     0             500  avgt    5     0.694 ±   0.005  ms/op
MappingManyValuesBenchmark.sequentialMap                     0            1000  avgt    5     1.328 ±   0.018  ms/op
MappingManyValuesBenchmark.sequentialMap                     1               5  avgt    5     6.253 ±   0.029  ms/op
MappingManyValuesBenchmark.sequentialMap                     1              10  avgt    5    12.060 ±   0.034  ms/op
MappingManyValuesBenchmark.sequentialMap                     1              50  avgt    5    59.625 ±   0.403  ms/op
MappingManyValuesBenchmark.sequentialMap                     1             100  avgt    5   119.110 ±   1.518  ms/op
MappingManyValuesBenchmark.sequentialMap                     1             500  avgt    5   614.544 ± 244.647  ms/op
MappingManyValuesBenchmark.sequentialMap                     1            1000  avgt    5  1175.224 ±  86.706  ms/op
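The 24 result rows are the cross product of the two @Param lists and the two @Benchmark methods. A small sketch that enumerates the same grid in plain Java (the class name ParamGrid is hypothetical and the code does not use JMH itself):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that only enumerates the combinations; it runs no benchmarks.
final class ParamGrid {

    // Values copied from the @Param annotations of MappingManyValuesBenchmark.
    static final int[] VALUES_AMOUNT = {5, 10, 50, 100, 500, 1000};
    static final int[] FIELD_MAPPING_TIME_MS = {0, 1};
    static final String[] BENCHMARKS = {"parallelMap", "sequentialMap"};

    /** Returns one description per (benchmark, fieldMappingTimeMs, valuesAmount) combination. */
    static List<String> combinations() {
        List<String> runs = new ArrayList<>();
        for (String benchmark : BENCHMARKS) {
            for (int timeMs : FIELD_MAPPING_TIME_MS) {
                for (int amount : VALUES_AMOUNT) {
                    runs.add(benchmark + " fieldMappingTimeMs=" + timeMs + " valuesAmount=" + amount);
                }
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        List<String> runs = combinations();
        runs.forEach(System.out::println);
        System.out.println(runs.size() + " combinations");
    }
}
```

Adding one more value to either @Param list adds another block of rows, so the total run time grows quickly with the size of the grid.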
Analysis and conclusions
To illustrate the results better, I organized the data into two tables.
Cheap field mapper (fieldMappingTimeMs = 0):
Number of mapped fields | Sequential map time | Parallel map time |
5 | ~ 0.119 ms | ~ 0.146 ms |
10 | ~ 0.123 ms | ~ 0.148 ms |
50 | ~ 0.171 ms | ~ 0.223 ms |
100 | ~ 0.256 ms | ~ 0.391 ms |
500 | ~ 0.694 ms | ~ 2.789 ms |
1000 | ~ 1.328 ms | ~ 10.261 ms |
Expensive field mapper (fieldMappingTimeMs = 1):
Number of mapped fields | Sequential map time | Parallel map time |
5 | ~ 6.253 ms | ~ 1.791 ms |
10 | ~ 12.060 ms | ~ 2.895 ms |
50 | ~ 59.625 ms | ~ 8.680 ms |
100 | ~ 119.110 ms | ~ 17.564 ms |
500 | ~ 614.544 ms | ~ 88.303 ms |
1000 | ~ 1175.224 ms | ~ 179.305 ms |
The results surprised me a little. With a cheap field mapper, the sequential algorithm is faster at every size, even when mapping many fields. Moreover, above 100 fields the parallel algorithm has a serious performance problem: at 500 fields it is about 4 times slower than the sequential one, and at 1000 fields almost 8 times slower. Only when the field mapper or validation is expensive does the parallel map win, and then it is roughly 3.5 to 7 times faster. This is a very important hint, and I can add this conclusion to the library documentation.
So the benchmark tests were very helpful, and it's a good lesson that it is better to measure than to guess.
You can see the complete example in the json-path-mapper library repository.
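To put numbers on this comparison, the ratio of the sequential time to the parallel time can be computed directly from the two tables. The sketch below does only that arithmetic; the class name SpeedupTable is hypothetical, and the timings are the average scores copied from the results above.

```java
// Hypothetical helper: computes sequential/parallel time ratios from the measured scores.
final class SpeedupTable {

    static final int[] FIELDS = {5, 10, 50, 100, 500, 1000};

    // Average times in ms/op, copied from the benchmark results.
    static final double[] SEQ_CHEAP = {0.119, 0.123, 0.171, 0.256, 0.694, 1.328};
    static final double[] PAR_CHEAP = {0.146, 0.148, 0.223, 0.391, 2.789, 10.261};
    static final double[] SEQ_EXPENSIVE = {6.253, 12.060, 59.625, 119.110, 614.544, 1175.224};
    static final double[] PAR_EXPENSIVE = {1.791, 2.895, 8.680, 17.564, 88.303, 179.305};

    /** A ratio above 1 means the parallel map is faster; below 1 means it is slower. */
    static double speedup(double sequentialMs, double parallelMs) {
        return sequentialMs / parallelMs;
    }

    public static void main(String[] args) {
        System.out.printf("%6s %14s %18s%n", "fields", "cheap mapper", "expensive mapper");
        for (int i = 0; i < FIELDS.length; i++) {
            System.out.printf("%6d %14.2f %18.2f%n",
                    FIELDS[i],
                    speedup(SEQ_CHEAP[i], PAR_CHEAP[i]),
                    speedup(SEQ_EXPENSIVE[i], PAR_EXPENSIVE[i]));
        }
    }
}
```

The ratios ignore the error column, so the row for 500 fields with the expensive mapper (± 244.647 ms on the sequential score) should be read with some caution.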