Compression Choices

William Silversmith edited this page Mar 24, 2023 · 5 revisions

CloudVolume offers several codecs for each type of data. This short guide (to be improved upon) gives some guidance on which one to choose.

Some encodings can be layered with a second-stage bitstream compression. We support gzip and brotli (br), mainly because those are what browsers (and hence Neuroglancer) support automatically. Support for other schemes such as zstd could be added in the future, but Neuroglancer would need a codec for it. Note that brotli is currently not supported for sharded data (Neuroglancer only has a gzip decompression JS module).
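
To make the layering concrete, here is a minimal sketch that uses only Python's standard library and does not call CloudVolume itself. It treats the bytes of a synthetic uint8 chunk as the "raw" encoding and applies gzip as the second stage. The chunk shape and contents are made up for illustration, and brotli is left out only because it is not part of the standard library.

```python
import gzip

# Hypothetical 64x64x16 uint8 chunk: a smooth repeating ramp, so it compresses well.
chunk = bytes((x + y) % 256 for z in range(16) for y in range(64) for x in range(64))

# Stage 1: the "raw" encoding is simply the voxel bytes.
raw = chunk

# Stage 2: bitstream compression layered on top, i.e. "raw+gzip".
raw_gzip = gzip.compress(raw)

# Reading reverses the stages: gunzip first, then interpret the raw bytes.
restored = gzip.decompress(raw_gzip)

print(len(raw), len(raw_gzip), restored == raw)
```

Real image data will compress far less than this synthetic ramp, so the printed sizes say nothing about the ratios to expect in practice.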

EM Images

Generally grayscale 8 or 16 bit electron or light microscopy images.

Choices: raw, raw+gzip, raw+br, png, jpeg

  • If you can tolerate lossy compression, jpeg will be very fast and give the best compression.
  • png will give the best lossless compression, by about 25%, but at the expense of speed.
  • raw+gzip and raw+br have slightly different performance profiles but will give similar compression at the default settings.
  • raw means uncompressed. Very fast on SSD, not so much on remote networks. Untenable for large datasets.
  • jpeg is effectively limited to 8-bit images. It technically supports 16-bit data, but only with a specially recompiled library, so treat it as unsupported.
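
The bullets above can be restated as a small decision function. This is only a sketch: `choose_image_encoding` and its parameters are hypothetical names invented here, not part of CloudVolume, and the function simply returns one of the labels from the "Choices" line. It does not check what your installation supports.

```python
def choose_image_encoding(bits, lossy_ok, prefer_speed=False):
    """Hypothetical helper that mirrors the EM image guidance as code."""
    if bits not in (8, 16):
        raise ValueError("this guide only covers 8 and 16 bit images")
    # jpeg is lossy and treated here as 8-bit only.
    if lossy_ok and bits == 8:
        return "jpeg"
    # Lossless options: png compresses best but is slower, while
    # raw+gzip and raw+br behave similarly at default settings.
    if prefer_speed:
        return "raw+gzip"
    return "png"


print(choose_image_encoding(8, lossy_ok=True))
print(choose_image_encoding(16, lossy_ok=True))
print(choose_image_encoding(8, lossy_ok=False, prefer_speed=True))
```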

Segmentation

These are usually uint32 or uint64 densely labeled data.

Choices: raw, compressed_segmentation (cseg), compresso (all +gzip or +br), crackle

  • For smooth segmentation, generally go with compresso+br for the best compression ratio and almost top performance. crackle+br gives better compression than compresso, but it has somewhat worse performance and is experimental.
  • For noisy segmentation, go with cseg+br for the best compression and top performance.
  • If you use crackle, please communicate with Will Silversmith.
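
As a sketch of how these recommendations combine with the earlier note that brotli is not currently supported for sharded data, the hypothetical function below picks a codec from the bullets and then falls back to gzip for the second stage when the data is sharded. The function name, its parameters, and the "codec+stage" string format are assumptions made for illustration, not a CloudVolume API.

```python
def choose_segmentation_encoding(smooth, sharded=False, experimental_ok=False):
    """Hypothetical helper that mirrors the segmentation guidance as code."""
    if smooth:
        # crackle compresses better but is experimental.
        codec = "crackle" if experimental_ok else "compresso"
    else:
        # Noisy segmentation: cseg is the recommended codec.
        codec = "cseg"
    # Brotli is not supported for sharded data, so use gzip there.
    second_stage = "gzip" if sharded else "br"
    return f"{codec}+{second_stage}"


print(choose_segmentation_encoding(smooth=True))
print(choose_segmentation_encoding(smooth=False, sharded=True))
```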

Compresso, crackle, and cseg are codecs designed for connectomics data. Crackle and compresso are novel high compression codecs.

Voxel-Wise Affinities

Intermediate float32 xyz neighbor affinity predictions used for creating segmentation and region graphs. These are very heavy: 12x bigger than an 8-bit base image. More information: https://github.com/seung-lab/cloud-volume/wiki/Advanced-Topic:-fpzip-and-kempressed-Encodings
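
The 12x figure follows from simple arithmetic, sketched below under the assumption that the base image is a single-channel uint8 volume. The helper name and the 1024^3 example volume are illustrative only.

```python
def bytes_per_voxel(channels, bytes_per_channel):
    """Uncompressed storage needed for one voxel."""
    return channels * bytes_per_channel


image = bytes_per_voxel(channels=1, bytes_per_channel=1)     # one uint8 channel
affinity = bytes_per_voxel(channels=3, bytes_per_channel=4)  # x, y, z as float32

ratio = affinity // image
print(ratio)  # 12

# Hypothetical 1024^3 voxel volume, sizes in GiB before any compression.
voxels = 1024 ** 3
print(voxels * image / 2 ** 30, voxels * affinity / 2 ** 30)  # 1.0 12.0
```

For a 16-bit base image the same calculation gives a factor of 6 rather than 12.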

Choices: raw, raw+gzip, raw+br, fpzip, kempressed

  • Use kempressed for best compression.
  • Note that the official Neuroglancer client cannot display fpzip or kempressed, so if that's a requirement you'll have to use raw, optionally with +gzip or +br.

Alignment Vectors

These are usually float32 images with an X and a Y component. Some older versions are int16, and this advice does not apply to them.

Choices: raw, raw+gzip, raw+br, fpzip, zfpc

  • The current best choice is raw+br.
  • zfpc is an experimental lossy compression choice that will likely be the go-to option in the future. Don't pick it for now unless you are in communication with Will Silversmith.

Visualizing Experimental Codecs

Experimental Codecs: fpzip, kempressed, crackle, and zfpc

These codecs are not integrated into mainline Neuroglancer. However, you can visualize them using a Neuroglancer fork.