9/15/2017

Oodle tuneability with space-speed tradeoff

Oodle's modern encoders take a parameter called the "space-speed tradeoff" (specifically OodleLZ_CompressOptions::spaceSpeedTradeoffBytes).

"speed" here always refers to decode speed - this is about the encoder making choices about how it forms the compressed bit stream.

This parameter allows the encoders to make decisions that optimize for a space-speed goal which is of your choosing. You can make those decisions favor size more, or you can favor decode speed more.

If you like, a modern compressor is a bit like a compiler. The compressed data is a kind of program in bytecode, and the decompressor is just an interpreter that runs that bytecode. An optimal parser is like an optimizing compiler; you're considering different programs that produce the same output, and trying to find the program that maximizes some metric. The "space-speed tradeoff" parameter is a bit like -Ox vs -Os, optimize for speed vs size in a compiler.

Oodle of course includes Hydra (the many headed beast) which can tune performance by selecting compressors based on their space-speed performance.

But even without Hydra the individual compressors are tuneable, none more so than Mermaid. Mermaid can stretch itself from Selkie-like (LZ4 domain) up to standard LZH compression (ZStd domain).

I thought I would show an example of how flexible Mermaid is. Here's Mermaid level 4 (Normal) with some different space-speed tradeoff parameters :


sstb = space speed tradeoff bytes

sstb 32 :  ooMermaid4  :  2.29:1 ,   33.6 enc mbps , 1607.2 dec mbps
sstb 64 :  ooMermaid4  :  2.28:1 ,   33.8 enc mbps , 1675.4 dec mbps
sstb 128:  ooMermaid4  :  2.23:1 ,   34.1 enc mbps , 2138.9 dec mbps
sstb 256:  ooMermaid4  :  2.19:1 ,   33.9 enc mbps , 2390.0 dec mbps
sstb 512:  ooMermaid4  :  2.05:1 ,   34.3 enc mbps , 2980.5 dec mbps
sstb 1024: ooMermaid4  :  1.89:1 ,   34.4 enc mbps , 3637.5 dec mbps

compare to : (*)

zstd9       :  2.18:1 ,   37.8 enc mbps ,  590.2 dec mbps
lz4hc       :  1.67:1 ,   29.8 enc mbps , 2592.0 dec mbps

(* MSVC build of ZStd/LZ4, not a fair speed measurement (they're faster in GCC); just use as a general reference point)

Point being - not only can Mermaid span a large range of performance but it's *good* at both ends of that range, it doesn't get terrible as it moves out of its comfort zone.

You may notice that as sstb goes below 128 you're losing a lot of decode speed and not gaining much size. The problem is you're trying to squeeze a lot of ratio out of a compressor that just doesn't target high ratio. As you get into that domain you need to switch to Kraken. That is, there comes a point where squeezing the last drop out of Mermaid is a worse space-speed deal than just making the jump to Kraken. And that's where Hydra comes in, it will do that for you at the right spot.
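To put rough numbers on that diminishing return, here's a small sketch that takes the Mermaid rows from the table above and computes, for each step down in sstb, how many compressed bytes you save per extra microsecond of decode time. Assumptions on my part: "mbps" is read as millions of bytes per second, and RAW_BYTES is a made-up input size (it cancels out of the final number). This is just arithmetic on the published figures, not anything the encoder itself does.

```python
# Rows copied from the table above: (sstb, compression ratio, decode speed).
# Assumption: "mbps" means millions of bytes per second.
MERMAID4 = [
    (32,   2.29, 1607.2),
    (64,   2.28, 1675.4),
    (128,  2.23, 2138.9),
    (256,  2.19, 2390.0),
    (512,  2.05, 2980.5),
    (1024, 1.89, 3637.5),
]

RAW_BYTES = 1_000_000  # hypothetical input size; it cancels out of the result


def size_and_time(ratio, dec_mbps):
    """Compressed size (bytes) and decode time (microseconds) for RAW_BYTES."""
    comp_bytes = RAW_BYTES / ratio
    decode_us = RAW_BYTES / dec_mbps  # bytes / (1e6 bytes/s) = microseconds
    return comp_bytes, decode_us


def marginal_gains(rows):
    """For each step to the next lower sstb: (from, to, bytes saved per extra us)."""
    ordered = sorted(rows, reverse=True)  # highest sstb (fastest decode) first
    out = []
    for (s_from, r_from, v_from), (s_to, r_to, v_to) in zip(ordered, ordered[1:]):
        size_from, time_from = size_and_time(r_from, v_from)
        size_to, time_to = size_and_time(r_to, v_to)
        bytes_saved = size_from - size_to
        extra_us = time_to - time_from
        out.append((s_from, s_to, bytes_saved / extra_us))
    return out


for s_from, s_to, gain in marginal_gains(MERMAID4):
    print(f"sstb {s_from:4d} -> {s_to:4d}: {gain:6.1f} bytes saved per extra us")
```

On these figures the first step (1024 to 512) buys several hundred bytes per extra microsecond, and the steps below 128 buy well under a hundred, which is the "losing a lot of decode speed and not gaining much size" effect. The ratios in the table are rounded to two decimals, so treat the small steps as noisy.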

ADD : Put another way, in Oodle there are *two* speed-ratio tradeoff dials. Most people are just familiar with the compression "level" dial, as in Zip, where higher levels = slower to encode, but more compression ratio. In Oodle you have that, but also a dial for decode time :


CompressionLevel = trade off encode time for compression ratio

SpaceSpeedTradeoffBytes = trade off decode time for compression ratio

Perhaps I'll show some sample use cases :

Default initial setting :

CompressionLevel = Normal (4)
SpaceSpeedTradeoffBytes = 256

Reasonably fast encode & decode.  This is a balance between caring about encode time, decode time,
and compression ratio.  Tries to do a decent job of all 3.

To maximize compression ratio, when you don't care about encode time or decode time :

CompressionLevel = Optimal4 (8)
SpaceSpeedTradeoffBytes = 1

You want every possible byte of compression and you don't care how much time it costs you to encode or
decode.  In practice this is a bit silly, rather like the "placebo" mode in x264.  You're spending
potentially a lot of CPU time for very small gains.

A more reasonable very high compression setting :

CompressionLevel = Optimal3 (7)
SpaceSpeedTradeoffBytes = 16

This still says you strongly value ratio over encode time or decode time, but you don't want to chase
tiny gains in ratio that cost a huge amount of decode time.

If you care about decode time but not encode time :

CompressionLevel = Optimal4 (8)
SpaceSpeedTradeoffBytes = 256

Crank up the encode level to spend lots of time making the best possible compressed stream, but make
decisions in the encoder that balance decode time.

etc.

SpaceSpeedTradeoffBytes is the number of bytes that Oodle must be able to save in order to accept a certain time increase in the decoder. In Kraken that unit of time is 25600 cycles on the artificial machine model that we use (that's 8.53 microseconds at 3 GHz). So at the default value of 256, it must save 1 byte of compressed size to accept an increased time of 100 cycles (25600 / 256 = 100).
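Here's that arithmetic as a small sketch, plus one way to read the rule as a combined cost. The 25600-cycle unit is the Kraken figure from the paragraph above; everything else (the cost() formula, the pick() helper, and the three candidate encodings with their byte and cycle counts) is hypothetical and only for illustration. The real encoder's internal cost function isn't described here.

```python
KRAKEN_TIME_UNIT_CYCLES = 25600  # from the text: Kraken's time unit on the machine model


def cycles_per_byte(sstb):
    """Extra decode cycles the encoder will accept in exchange for one saved byte."""
    return KRAKEN_TIME_UNIT_CYCLES / sstb


def cost(comp_bytes, decode_cycles, sstb):
    """Hypothetical combined cost in bytes: size plus decode time priced via sstb."""
    return comp_bytes + sstb * decode_cycles / KRAKEN_TIME_UNIT_CYCLES


def pick(candidates, sstb):
    """Name of the candidate with the lowest combined cost at this sstb."""
    return min(candidates, key=lambda c: cost(c[1], c[2], sstb))[0]


# Made-up ways to encode the same data: (name, compressed bytes, decode cycles).
CANDIDATES = [
    ("big_fast", 12, 40),
    ("balanced", 6, 400),
    ("small_slow", 3, 1600),
]

print(cycles_per_byte(256))                  # 100.0 cycles per byte at the default
print(KRAKEN_TIME_UNIT_CYCLES / 3e9 * 1e6)   # time unit in microseconds at 3 GHz

for sstb in (1, 16, 256, 1024):
    print(sstb, pick(CANDIDATES, sstb))
```

With these made-up candidates, sstb 1 and 16 pick the smallest encoding, 256 picks the middle one, and 1024 picks the fastest: a higher SpaceSpeedTradeoffBytes makes each cycle of decode time cost more bytes, so the choice shifts toward speed.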
