grlee77/uskimage-demo

General Overview

This repository is a proof-of-concept demo for different concrete backend implementations for scikit-image. It is not intended for use by downstream projects. In practice, such backends would ideally be maintained by their upstream projects (cuCIM, etc.) so that they can be tested, ensuring that they stay in sync.

This repository is compatible with a specific uarray development branch of scikit-image, which implements uarray multimethods for the majority of the scikit-image API. Most of the multimethods were autogenerated using a script provided in the tools directory here. That branch also differs from scikit-image main in that some currently deprecated arguments have already been removed. Specifically, multichannel was removed in favor of channel_axis and selem was removed in favor of footprint.
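
Code written against the older argument names needs a small translation before it can call into that branch. The following is a minimal sketch of such a translation, not something shipped in this repository: the helper name migrate_kwargs is hypothetical, and it assumes the usual scikit-image equivalence in which multichannel=True corresponds to channel_axis=-1 and multichannel=False corresponds to channel_axis=None.

```python
def migrate_kwargs(kwargs):
    """Return a copy of ``kwargs`` with removed argument names translated.

    Hypothetical helper (not part of uskimage-demo or scikit-image).
    """
    new = dict(kwargs)

    if "multichannel" in new:
        if "channel_axis" in new:
            raise TypeError("pass either multichannel or channel_axis, not both")
        # multichannel=True meant "channels are on the last axis".
        new["channel_axis"] = -1 if new.pop("multichannel") else None

    if "selem" in new:
        if "footprint" in new:
            raise TypeError("pass either selem or footprint, not both")
        # selem was a pure rename; the value is passed through unchanged.
        new["footprint"] = new.pop("selem")

    return new


old_style = {"sigma": 2, "multichannel": True}
new_style = migrate_kwargs(old_style)
```

The returned dictionary can then be unpacked into a call such as gaussian(image, **new_style); the input dictionary is left unmodified.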

Installation

An editable development install can be performed by checking out the repository and using:

pip install -e . -v

Runtime requirements:

- NumPy >= 1.17
- CuPy >= 9
- cuCIM >= 21.08
- uarray >= 0.8.2
- dask[array] >= 2.0
- scikit-image from branch https://github.com/grlee77/scikit-image/tree/uarray
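
As a convenience, the version minimums above can be checked from Python before running the demo. This is only a sketch using the standard library: the distribution names in MINIMUMS are assumptions (CuPy wheels in particular are often published under CUDA-specific names, so a missing "cupy" entry does not necessarily mean CuPy is absent), and the scikit-image branch requirement cannot be verified this way.

```python
import re
from importlib import metadata

# Minimums copied from the list above; distribution names are assumptions.
MINIMUMS = {
    "numpy": "1.17",
    "cupy": "9",
    "cucim": "21.08",
    "uarray": "0.8.2",
    "dask": "2.0",
}


def release_tuple(version):
    """Parse the leading dotted-integer part of a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(p) for p in match.group().split(".")) if match else ()


def meets(installed, minimum):
    """True if ``installed`` is at least ``minimum`` (numeric parts only)."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that "2" and "2.0" compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUMS):
    """Map each name to (installed_version_or_None, requirement_met)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
        else:
            report[name] = (installed, meets(installed, minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_requirements().items():
        print(f"{name:<8} installed={installed} ok={ok}")
```

Pre-release suffixes are ignored by this comparison, so it is intentionally coarser than a full version parser.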

Featured Backends

A small subset of functions, such as skimage.filters.gaussian, skimage.filters.median and skimage.morphology.binary_erosion, is implemented for multiple backends. Specifically, the following backends are provided:

  • uskimage_demo.dask_backend - This backend accepts Dask arrays as input and returns Dask futures.

  • uskimage_demo.dask_numpy_backend - This backend shares the multimethod implementations of dask_backend, but differs in behavior in two ways: 1.) It automatically converts numpy.ndarray inputs to Dask arrays with a suitable chunk size. 2.) It calls compute() on the function outputs so that NumPy arrays are returned instead of Dask futures.

  • uskimage_demo.diplib_backend - This backend calls corresponding functions from the DIPlib library. DIPlib provides efficient, multithreaded C++ implementations with Python wrappers.

  • uskimage_demo.cucim_backend - This backend accepts CuPy GPU array inputs and calls the corresponding GPU-based implementations from cuCIM.

  • uskimage_demo.cucim_cpu_backend - This backend is a wrapper around cucim_backend that handles host/device transfers such that NumPy array inputs will be transferred to the GPU and the function outputs will be transferred back to NumPy arrays on the host.
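
The wrapping pattern described for cucim_cpu_backend (convert inputs, call the accelerated function, convert the output back) can be sketched as a decorator. This is an illustration under stated assumptions rather than the code in this repository: with_transfers and its parameters are hypothetical names, and plain lists and tuples stand in for host and device arrays so that the example runs without a GPU. With CuPy installed, cupy.asarray and cupy.asnumpy would be the natural choices for to_device and to_host.

```python
import functools


def with_transfers(to_device, to_host, is_host_array):
    """Wrap a device-only function so it accepts and returns host arrays.

    Hypothetical helper: ``to_device`` and ``to_host`` are conversion
    callables, ``is_host_array`` decides which arguments get converted.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Move every host array argument to the device; leave scalars,
            # strings and other parameters untouched.
            args = tuple(to_device(a) if is_host_array(a) else a for a in args)
            kwargs = {
                k: to_device(v) if is_host_array(v) else v
                for k, v in kwargs.items()
            }
            # Run on the device, then bring the result back to the host.
            return to_host(func(*args, **kwargs))
        return wrapper
    return decorator


# Stand-ins for this sketch: lists play the role of NumPy (host) arrays
# and tuples play the role of CuPy (device) arrays.
@with_transfers(
    to_device=tuple,
    to_host=list,
    is_host_array=lambda x: isinstance(x, list),
)
def device_scale(values, scale=2):
    assert isinstance(values, tuple), "expected a 'device' array"
    return tuple(scale * v for v in values)


result = device_scale([1, 2, 3], scale=3)
```

The dask_numpy_backend described above follows the same shape, with conversion to a chunked Dask array on the way in and compute() on the way out.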

Other possible backends to explore

SimpleITK is another popular library supporting nD images that could likely provide a backend for some functions.

OpenCV or scikit-ipp could also be used. While fast, these support a more restricted set of dtypes and, I think, are typically limited to 2D images.

Benchmarking results

Running the included demo/backends_demo.py shows that the provided backends can substantially accelerate computations relative to scikit-image's own implementations. The timings below were measured on a single-node workstation with a 10-core/20-thread CPU (Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz). For the cuCIM-based backends, computations ran on an NVIDIA GTX 1080 Ti GPU.

In most cases, the backends rank as follows, from fastest to slowest:

cucim_backend > cucim_cpu_backend > diplib_backend > dask_backend > scikit-image

It should be noted, though, that the above is for use of a single node whereas the Dask case could potentially be extended to also distribute work across multiple nodes.
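
The timing lines reported below list a mean, a spread, and min/max values in milliseconds. A harness producing numbers of that shape could look roughly like the following sketch. It is not the code in demo/backends_demo.py: time_call and format_timing are hypothetical names, the spread is computed here as a population standard deviation (the demo's exact statistic is not stated), and GPU timing would additionally need device synchronization, which is omitted.

```python
import statistics
import time


def time_call(func, *args, repeats=5, **kwargs):
    """Time ``func(*args, **kwargs)`` several times; return stats in ms."""
    durations_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean": statistics.mean(durations_ms),
        "std": statistics.pstdev(durations_ms),
        "min": min(durations_ms),
        "max": max(durations_ms),
    }


def format_timing(label, stats, device="CPU"):
    """Format stats in a layout similar to the result listings."""
    return (
        f"{label:<25}:    {device}: {stats['mean']:8.3f} ms   "
        f"+/-{stats['std']:6.3f} "
        f"(min: {stats['min']:8.3f} / max: {stats['max']:8.3f}) ms"
    )


if __name__ == "__main__":
    # A cheap stand-in workload; a real benchmark would call e.g. a filter.
    stats = time_call(sorted, list(range(100_000, 0, -1)), repeats=5)
    print(format_timing("sorted (stand-in)", stats))
```

Entries in the listings that show only a single value, such as the slow median cases, appear to have been run without repeats.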

Important caveat

For this demo, we intentionally did not implement a Dask or DIPlib multimethod for either difference_of_gaussians or structural_similarity. Note, however, that difference_of_gaussians still shows acceleration with both of these backends because it calls gaussian twice internally, and gaussian does have an implementation in each backend. The case is similar for structural_similarity, which also uses gaussian internally. With DIPlib, however, the boundary extension mode it uses is not available, so the call to gaussian falls back to the standard scikit-image implementation and no acceleration is observed.
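
The two behaviors described above, inherited acceleration through an inner call and fallback when a backend cannot handle a request, can be illustrated with a toy dispatcher. This is a deliberately simplified stand-in and not uarray's actual mechanism: the multimethod decorator, the set_backend context manager, and the dict-based backend are all defined here for illustration, the arithmetic is a placeholder rather than a real filter, and the mode names are arbitrary (the source does not say which mode DIPlib lacks).

```python
import contextlib
import functools

_active_backend = None  # toy global state; uarray manages this differently


@contextlib.contextmanager
def set_backend(backend):
    """Temporarily activate a toy backend (dict of name -> implementation)."""
    global _active_backend
    previous, _active_backend = _active_backend, backend
    try:
        yield
    finally:
        _active_backend = previous


def multimethod(default):
    """Try the active backend first; fall back to the default otherwise."""
    @functools.wraps(default)
    def dispatcher(*args, **kwargs):
        impl = (_active_backend or {}).get(default.__name__)
        if impl is not None:
            result = impl(*args, **kwargs)
            if result is not NotImplemented:
                return result
        return default(*args, **kwargs)
    return dispatcher


calls = []  # records which implementation actually ran


@multimethod
def gaussian(image, sigma, mode="nearest"):
    calls.append("default gaussian")
    return [sigma * v for v in image]  # placeholder arithmetic


@multimethod
def difference_of_gaussians(image, low_sigma, high_sigma, mode="nearest"):
    calls.append("default difference_of_gaussians")
    low = gaussian(image, low_sigma, mode=mode)    # dispatched again
    high = gaussian(image, high_sigma, mode=mode)  # dispatched again
    return [a - b for a, b in zip(low, high)]


def backend_gaussian(image, sigma, mode="nearest"):
    if mode != "nearest":
        return NotImplemented  # unsupported mode: ask for fallback
    calls.append("backend gaussian")
    return [sigma * v for v in image]


# The toy backend implements gaussian only, as in the caveat above.
toy_backend = {"gaussian": backend_gaussian}

with set_backend(toy_backend):
    accelerated = difference_of_gaussians([1.0, 2.0], 1, 3)
    accelerated_calls = list(calls)
    calls.clear()
    fallback = difference_of_gaussians([1.0, 2.0], 1, 3, mode="reflect")
    fallback_calls = list(calls)
```

In the first call the outer function runs its default implementation but both inner gaussian calls reach the backend; in the second, the backend declines the unsupported mode and every call runs the default.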

Also, for binary erosion with somewhat large footprints as used here, we have pending improvements enabling footprint decomposition that will give faster results for both the scikit-image and cucim backends relative to what is presented here.

2D results

Running with 2D data:
    For gaussian, median and structural_similarity functions:
        image.shape=(4096, 2048)
        image.dtype=float32
    For binary erosion:
        binary blobs shape=(2048, 2048)
        ball footprint shape = (11, 11)
        rect footprint shape = (11, 11)


** Timings with default skimage backend**
gaussian                 :    CPU:  358.497 ms   +/- 2.816 (min:  354.936 / max:  362.594) ms
difference_of_gaussians  :    CPU:  627.612 ms   +/-21.027 (min:  609.264 / max:  663.219) ms
median (ball)            :    CPU: 7685.304 ms
median (rect)            :    CPU:11060.581 ms
erosion (ball)           :    CPU:  147.811 ms   +/- 2.450 (min:  142.464 / max:  150.965) ms
erosion (rect)           :    CPU:  233.441 ms   +/-32.597 (min:  214.944 / max:  317.526) ms
structural sim.          :    CPU: 1480.584 ms   +/-12.704 (min: 1467.881 / max: 1493.288) ms


** Timings with Dask backend enabled (CPU, scheduler='threads')**
gaussian                 :    CPU:  198.221 ms   +/-17.518 (min:  160.228 / max:  232.355) ms
difference_of_gaussians  :    CPU:  238.046 ms   +/-14.382 (min:  213.749 / max:  265.874) ms
median (ball)            :    CPU:  888.443 ms   +/-27.000 (min:  864.415 / max:  926.157) ms
median (rect)            :    CPU: 1169.550 ms   +/-10.241 (min: 1159.309 / max: 1179.790) ms
erosion (ball)           :    CPU:  141.625 ms   +/- 8.409 (min:  125.899 / max:  153.718) ms
erosion (rect)           :    CPU:  163.532 ms   +/-21.080 (min:  142.507 / max:  231.082) ms
structural sim.          :    CPU: 1076.996 ms   +/-64.082 (min: 1012.913 / max: 1141.078) ms


** Timings with diplib backend enabled (threads=20)**
gaussian                 :    CPU:   58.320 ms   +/-21.611 (min:   40.461 / max:  100.524) ms
difference_of_gaussians  :    CPU:  167.277 ms   +/-20.296 (min:  130.111 / max:  198.398) ms
median (ball)            :    CPU:  666.159 ms   +/-20.454 (min:  645.942 / max:  698.992) ms
median (rect)            :    CPU:  938.623 ms   +/-32.421 (min:  896.971 / max:  976.050) ms
erosion (ball)           :    CPU:   22.463 ms   +/- 6.103 (min:    6.640 / max:   39.190) ms
erosion (rect)           :    CPU:   50.127 ms   +/-20.970 (min:    6.477 / max:   74.148) ms
structural sim.          :    CPU: 1461.264 ms   +/-20.882 (min: 1440.382 / max: 1482.146) ms


** Timings with cucim_cpu backend **
gaussian                 :    CPU:   25.767 ms   +/- 0.138 (min:   25.459 / max:   26.201) ms
difference_of_gaussians  :    CPU:   29.383 ms   +/- 0.103 (min:   29.149 / max:   29.596) ms
median (ball)            :    CPU:  280.674 ms   +/- 0.208 (min:  280.395 / max:  280.987) ms
median (rect)            :    CPU:  550.639 ms   +/- 0.307 (min:  550.211 / max:  550.992) ms
erosion (ball)           :    CPU:    6.330 ms   +/- 0.021 (min:    6.286 / max:    6.404) ms
erosion (rect)           :    CPU:    2.486 ms   +/- 0.016 (min:    2.445 / max:    2.542) ms
structural sim.          :    CPU:   42.156 ms   +/- 0.091 (min:   42.009 / max:   42.390) ms


** Timings with cucim backend (no host/device transfer) **
gaussian                 :    GPU-0:   10.232 ms   +/- 0.025 (min:   10.199 / max:   10.329) ms
difference_of_gaussians  :    GPU-0:   13.890 ms   +/- 0.019 (min:   13.853 / max:   13.953) ms
median (ball)            :    GPU-0:  265.037 ms   +/- 0.210 (min:  264.738 / max:  265.526) ms
median (rect)            :    GPU-0:  534.763 ms   +/- 0.259 (min:  534.431 / max:  535.066) ms
erosion (ball)           :    GPU-0:    4.986 ms   +/- 0.017 (min:    4.966 / max:    5.206) ms
erosion (rect)           :    GPU-0:    1.153 ms   +/- 0.010 (min:    1.136 / max:    1.232) ms
structural sim.          :    GPU-0:   32.278 ms   +/- 0.022 (min:   32.220 / max:   32.359) ms
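
To read the 2D listings as speedups, each backend's mean time can be divided into the default scikit-image mean. The short sketch below does this for two rows, with the mean values copied from the listings above; the dictionary layout and the speedups helper are illustrative choices. For gaussian the cucim_backend comes out at roughly 35x, while structural sim. with diplib_backend stays close to 1x, consistent with the fallback described in the caveat.

```python
# Mean timings in ms, copied from the 2D listings above.
MEAN_MS = {
    "gaussian": {
        "scikit-image": 358.497,
        "dask_backend": 198.221,
        "diplib_backend": 58.320,
        "cucim_cpu_backend": 25.767,
        "cucim_backend": 10.232,
    },
    "structural sim.": {
        "scikit-image": 1480.584,
        "dask_backend": 1076.996,
        "diplib_backend": 1461.264,
        "cucim_cpu_backend": 42.156,
        "cucim_backend": 32.278,
    },
}


def speedups(timings, baseline="scikit-image"):
    """Return baseline_time / backend_time for every non-baseline entry."""
    base = timings[baseline]
    return {
        name: base / ms for name, ms in timings.items() if name != baseline
    }


if __name__ == "__main__":
    for function, timings in MEAN_MS.items():
        for backend, factor in speedups(timings).items():
            print(f"{function:<16} {backend:<18} {factor:6.1f}x")
```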

3D results

Running with 3D data:
    For filters and measurements functions:
        image.shape=(256, 256, 128)
        image.dtype=float32
    For binary morphology:
        binary blobs shape=(256, 256, 256)
        ball footprint shape = (7, 7, 7)
        rect footprint shape = (7, 7, 7)


** Timings with default skimage backend**
gaussian                 :    CPU:  470.269 ms   +/- 3.526 (min:  465.000 / max:  474.272) ms
difference_of_gaussians  :    CPU:  762.822 ms   +/- 2.910 (min:  758.849 / max:  765.737) ms
median (ball)            :    CPU:11378.148 ms
median (rect)            :    CPU:29591.795 ms
erosion (ball)           :    CPU:  753.432 ms   +/- 9.170 (min:  740.725 / max:  762.029) ms
erosion (rect)           :    CPU: 1507.758 ms   +/- 5.866 (min: 1501.892 / max: 1513.625) ms
structural sim.          :    CPU: 1752.775 ms   +/- 9.866 (min: 1742.909 / max: 1762.641) ms


** Timings with Dask backend enabled (CPU, scheduler='threads')**
gaussian                 :    CPU:  290.269 ms   +/-22.404 (min:  259.381 / max:  323.366) ms
difference_of_gaussians  :    CPU:  330.863 ms   +/-31.330 (min:  291.124 / max:  389.668) ms
median (ball)            :    CPU: 1458.169 ms   +/-17.291 (min: 1440.877 / max: 1475.460) ms
median (rect)            :    CPU: 3468.485 ms
erosion (ball)           :    CPU:  725.926 ms   +/-43.245 (min:  692.958 / max:  787.020) ms
erosion (rect)           :    CPU:  725.291 ms   +/-51.361 (min:  659.213 / max:  784.449) ms
structural sim.          :    CPU: 1284.369 ms   +/-16.837 (min: 1267.532 / max: 1301.207) ms


** Timings with diplib backend enabled (threads=20)**
gaussian                 :    CPU:  102.941 ms   +/-14.279 (min:   68.722 / max:  135.538) ms
difference_of_gaussians  :    CPU:  198.852 ms   +/-20.036 (min:  173.322 / max:  227.423) ms
median (ball)            :    CPU:  931.723 ms   +/-27.284 (min:  895.307 / max:  960.978) ms
median (rect)            :    CPU: 2003.327 ms
erosion (ball)           :    CPU:   82.589 ms   +/-10.962 (min:   71.367 / max:  114.855) ms
erosion (rect)           :    CPU:   20.668 ms   +/- 2.455 (min:   17.455 / max:   37.924) ms
structural sim.          :    CPU: 1688.337 ms   +/- 5.981 (min: 1682.355 / max: 1694.318) ms


** Timings with cucim_cpu backend **
gaussian                 :    CPU:   31.280 ms   +/- 0.768 (min:   30.856 / max:   34.549) ms
difference_of_gaussians  :    CPU:   36.145 ms   +/- 0.818 (min:   35.816 / max:   40.588) ms
median (ball)            :    CPU:  534.166 ms   +/- 0.212 (min:  533.818 / max:  534.389) ms
median (rect)            :    CPU: 2755.902 ms
erosion (ball)           :    CPU:   67.111 ms   +/- 0.049 (min:   67.013 / max:   67.253) ms
erosion (rect)           :    CPU:   19.619 ms   +/- 0.071 (min:   19.481 / max:   20.001) ms
structural sim.          :    CPU:   51.924 ms   +/- 0.100 (min:   51.713 / max:   52.213) ms


** Timings with cucim backend (no host/device transfer) **
gaussian                 :    GPU-0:   15.000 ms   +/- 0.025 (min:   14.975 / max:   15.210) ms
difference_of_gaussians  :    GPU-0:   20.051 ms   +/- 0.080 (min:   19.974 / max:   20.469) ms
median (ball)            :    GPU-0:  528.453 ms   +/- 0.152 (min:  528.318 / max:  528.710) ms
median (rect)            :    GPU-0: 2743.274 ms
erosion (ball)           :    GPU-0:   62.290 ms   +/- 0.054 (min:   62.195 / max:   62.407) ms
erosion (rect)           :    GPU-0:   14.698 ms   +/- 0.064 (min:   14.482 / max:   14.863) ms
structural sim.          :    GPU-0:   42.686 ms   +/- 0.032 (min:   42.646 / max:   42.861) ms

More info/discussion on backend approaches

At the time of writing, there is an ongoing discussion across scientific Python projects about how to unify approaches to dispatching. Others interested in this topic are encouraged to participate in the related conversations at the scientific-python forum.
