Add multithreaded support in the DWT encoder. 1248/head
author Even Rouault <even.rouault@spatialys.com>
Thu, 30 Apr 2020 09:52:42 +0000 (11:52 +0200)
committer Even Rouault <even.rouault@spatialys.com>
Wed, 20 May 2020 18:30:21 +0000 (20:30 +0200)
commit 07d1f775a1ef95496b0c78b18f671dac41983320
tree 6e69f9d1e92244c2fc1ec4d3b9b4975e7b37b6c4
parent 97eb7e0bf17b476d516262e0af462ec7eeb8f505
Add multithreaded support in the DWT encoder.

Update the bench_dwt utility to have a -decode/-encode switch.

Measured performance gains for the DWT encoder on an
Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz (4 cores, hyper-threaded)

Encoding time:
$ ./bin/bench_dwt -encode -num_threads 1
time for dwt_encode: total = 8.348 s, wallclock = 8.352 s

$ ./bin/bench_dwt -encode -num_threads 2
time for dwt_encode: total = 9.776 s, wallclock = 4.904 s

$ ./bin/bench_dwt -encode -num_threads 4
time for dwt_encode: total = 13.188 s, wallclock = 3.310 s

$ ./bin/bench_dwt -encode -num_threads 8
time for dwt_encode: total = 30.024 s, wallclock = 4.064 s
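
To make the figures above easier to compare, here is a minimal sketch that
derives speedup, parallel efficiency and CPU-time overhead from the quoted
"total" and "wallclock" values. The helper names (speedup, efficiency,
cpu_overhead, print_summary) are hypothetical and are not part of bench_dwt.
With these numbers the wallclock speedup is about 1.7x, 2.5x and 2.1x for
2, 4 and 8 threads, i.e. an efficiency of roughly 85%, 63% and 26%.

```c
#include <stdio.h>

/* Speedup of an n-thread run relative to the 1-thread wallclock time. */
static double speedup(double wall_1, double wall_n)
{
    return wall_1 / wall_n;
}

/* Parallel efficiency: speedup divided by the number of threads. */
static double efficiency(double wall_1, double wall_n, int num_threads)
{
    return speedup(wall_1, wall_n) / (double)num_threads;
}

/* CPU-time overhead: total CPU time relative to the 1-thread total. */
static double cpu_overhead(double total_1, double total_n)
{
    return total_n / total_1;
}

/* Print one summary line per measurement quoted in the commit message. */
static void print_summary(void)
{
    static const int threads[] = {1, 2, 4, 8};
    static const double total[] = {8.348, 9.776, 13.188, 30.024};
    static const double wall[] = {8.352, 4.904, 3.310, 4.064};
    size_t i;

    for (i = 0; i < sizeof(threads) / sizeof(threads[0]); i++) {
        printf("%d thread(s): speedup %.2fx, efficiency %.0f%%, "
               "CPU overhead %.2fx\n",
               threads[i],
               speedup(wall[0], wall[i]),
               100.0 * efficiency(wall[0], wall[i], threads[i]),
               cpu_overhead(total[0], total[i]));
    }
}
```

Calling print_summary() from a main() prints one line per thread count. The
growing CPU overhead (total time rising while wallclock falls) is what the
remarks on scaling refer to.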

Scaling is probably limited by the memory access patterns, which make
memory access the bottleneck.
The slightly worse result with -num_threads 8 than with -num_threads 4
is due to hyperthreading not being appropriate here.
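
For readers unfamiliar with how a transform pass can be spread over threads,
the following is a minimal sketch of one common approach: splitting the rows
(or columns) of a resolution level into contiguous bands, one per job. It is
an illustration under stated assumptions, not code taken from dwt.c; the type
band_range_t and the functions clamp_num_jobs and band_for_job are
hypothetical names.

```c
#include <stddef.h>

/* Hypothetical job descriptor: half-open range [start, end) of lines. */
typedef struct {
    size_t start;
    size_t end;
} band_range_t;

/* Never create more jobs than there are lines, and always at least one. */
static size_t clamp_num_jobs(size_t count, size_t num_threads)
{
    if (num_threads == 0 || count == 0) {
        return 1;
    }
    return count < num_threads ? count : num_threads;
}

/* Split `count` lines into `num_jobs` contiguous bands.
 * Assumes 1 <= num_jobs and job < num_jobs.
 * The last band absorbs the remainder of the integer division. */
static band_range_t band_for_job(size_t count, size_t num_jobs, size_t job)
{
    band_range_t r;
    size_t step = count / num_jobs;

    r.start = job * step;
    r.end = (job + 1 == num_jobs) ? count : (job + 1) * step;
    return r;
}
```

Each worker would then process only the lines in its own band, so no two
jobs write to the same line. With 10 lines and 4 jobs this yields the bands
[0,2), [2,4), [4,6) and [6,10).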
CMakeLists.txt
src/lib/openjp2/CMakeLists.txt
src/lib/openjp2/bench_dwt.c
src/lib/openjp2/dwt.c
src/lib/openjp2/dwt.h
src/lib/openjp2/tcd.c
tools/ctest_scripts/travis-ci.cmake