diff --git a/.github/API_DESIGN.md b/.github/API_DESIGN.md index 797504539b..0f599b5999 100644 --- a/.github/API_DESIGN.md +++ b/.github/API_DESIGN.md @@ -2,34 +2,34 @@ In general, KerasCV abides to the [API design guidelines of Keras](https://github.com/keras-team/governance/blob/master/keras_api_design_guidelines.md). -There are a few API guidelines that apply only to KerasCV. These are discussed +There are a few API guidelines that apply only to KerasCV. These are discussed in this document. # Label Names When working with `bounding_box` and `segmentation_map` labels the -abbreviations `bbox` and `segm` are often used. In KerasCV, we will *not* be -using these abbreviations. This is done to ensure full consistency in our -naming convention. While the team is fond of the abbreviation `bbox`, we are -less fond of `segm`. In order to ensure full consistency, we have decided to +abbreviations `bbox` and `segm` are often used. In KerasCV, we will *not* be +using these abbreviations. This is done to ensure full consistency in our +naming convention. While the team is fond of the abbreviation `bbox`, we are +less fond of `segm`. In order to ensure full consistency, we have decided to use the full names for label types in our code base. # Preprocessing Layers ## Strength Parameters Many augmentation layers take a parameter representing a strength, often called -`factor`. When possible, factor values must conform to a the range: `[0, 1]`, with +`factor`. When possible, factor values must conform to the range: `[0, 1]`, with 1 representing the strongest transformation and 0 representing a no-op transform. -The strength of an augmentation should scale linearly with this factor. If needed, -a transformation can be performed to map to a large value range internally. If +The strength of an augmentation should scale linearly with this factor. If needed, +a transformation can be performed to map to a large value range internally. If this is done, please provide a thorough explanation of the value range semantics in the docstring. -Additionally, factors should support both float and tuples as inputs. If a float is +Additionally, factors should support both float and tuples as inputs. If a float is passed, such as `factor=0.5`, the layer should default to the range `[0, factor]`. ## BaseImageAugmentationLayer When implementing preprocessing, we encourage users to subclass the -`keras_cv.layers.preprocessing.BaseImageAugmentationLayer`. This layer provides - a common `call()` method, auto vectorization, and more. +`keras_cv.layers.preprocessing.BaseImageAugmentationLayer`. This layer provides + a common `call()` method, auto vectorization, and more. When subclassing `BaseImageAugmentationLayer`, several methods can overridden: @@ -41,20 +41,20 @@ When subclassing `BaseImageAugmentationLayer`, several methods can overridden: ## Vectorization `BaseImageAugmentationLayer` requires you to implement augmentations in an -image-wise basis instead of using a vectorized approach. This design choice +image-wise basis instead of using a vectorized approach. This design choice was based made on the results found in the [vectorization\_strategy\_benchmark.py](../benchmarks/vectorization_strategy_benchmark.py) benchmark. In short, the benchmark shows that making use of `tf.vectorized_map()` performs -almost identically to a manually vectorized implementation. As such, we have +almost identically to a manually vectorized implementation. As such, we have decided to rely on `tf.vectorized_map()` for performance. 
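For illustration, the guidelines above can be combined into one small example layer. The sketch below is not a layer from the library: the overridable method names and the import path follow the layers shown elsewhere in this diff, while the factor parsing is simplified and `tf.random.uniform` stands in for the layer's seeded random generator.

```python
import tensorflow as tf

from keras_cv.layers.preprocessing.base_image_augmentation_layer import (
    BaseImageAugmentationLayer,
)


class ToyRandomDarken(BaseImageAugmentationLayer):
    """Illustrative layer: darkens float images by a sampled strength."""

    def __init__(self, factor, **kwargs):
        super().__init__(**kwargs)
        # A single float means "sample from [0, factor]"; a tuple gives the range.
        if isinstance(factor, (int, float)):
            factor = (0.0, factor)
        self.factor = factor

    def get_random_transformation(self, **kwargs):
        # Strength in [0, 1]: 0 is a no-op, 1 is the strongest transformation.
        return tf.random.uniform((), self.factor[0], self.factor[1])

    def augment_image(self, image, transformation, **kwargs):
        # Written image-wise; the base layer handles batching/vectorization.
        return image * (1.0 - transformation)

    def augment_label(self, label, transformation, **kwargs):
        return label
```

Calling `ToyRandomDarken(factor=(0.2, 0.5))` on a batch samples a fresh strength for every image, so the augmentation strength still scales linearly with the sampled factor.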
![Results of vectorization strategy benchmark](images/runtime-plot.png) ## Color Based Preprocessing Layers -Some preprocessing layers in KerasCV perform color based transformations. This -includes `RandomBrightness`, `Equalize`, `Solarization`, and more. +Some preprocessing layers in KerasCV perform color based transformations. This +includes `RandomBrightness`, `Equalize`, `Solarization`, and more. Preprocessing layers that perform color based transformations make the following assumptions: @@ -63,10 +63,10 @@ following assumptions: - input images may be of any `dtype` The decision to support inputs of any `dtype` is made based on the nuance that -some Keras layers cast user inputs without the user knowing. For example, if +some Keras layers cast user inputs without the user knowing. For example, if `Solarization` expected user inputs to be of type `int`, and a custom layer was accidentally casting inputs to `float32`, it would be a bad user experience -to raise an error asserting that all inputs must be of type `int`. +to raise an error asserting that all inputs must be of type `int`. New preprocessing layers should be consistent with these decisions. diff --git a/.github/CALL_FOR_CONTRIBUTIONS.md b/.github/CALL_FOR_CONTRIBUTIONS.md index 84b4bab983..e8a1ceacc1 100644 --- a/.github/CALL_FOR_CONTRIBUTIONS.md +++ b/.github/CALL_FOR_CONTRIBUTIONS.md @@ -1,7 +1,7 @@ # Call For Contributions Contributors looking for a task can look at the following list to find an item -to work on. Should you decide to contribute a component, please comment on the -corresponding GitHub issue that you will be working on the component. A team +to work on. Should you decide to contribute a component, please comment on the +corresponding GitHub issue that you will be working on the component. A team member will then follow up by assigning the issue to you. [There is a contributions welcome label available here](https://github.com/keras-team/keras-cv/issues?page=2&q=is%3Aissue+is%3Aopen+label%3Acontribution-welcome) diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md index ac4f804f2e..034eeee599 100644 --- a/.github/CONTRIBUTING.md +++ b/.github/CONTRIBUTING.md @@ -19,9 +19,9 @@ to open a PR without discussion. ### Step 2. Make code changes -To make code changes, you need to fork the repository. You will need to setup a +To make code changes, you need to fork the repository. You will need to set up a development environment and run the unit tests. This is covered in section -"Setup environment". +"set up environment". If your code change involves introducing a new API change, please see our [API Design Guidelines](API_DESIGN.md). @@ -43,7 +43,7 @@ The agreement can be found at [https://cla.developers.google.com/clas](https://c ### Step 5. Code review -CI tests will automatically be run directly on your pull request. Their +CI tests will automatically be run directly on your pull request. Their status will be reported back via GitHub actions. There may be @@ -92,7 +92,7 @@ We currently support only a small handful of ops that run on CPU and are not use If you are updating existing custom ops, you can re-compile the binaries from source using the instructions in the `Tests that require custom ops` section below. -## Setup environment +## set up environment Setting up your KerasCV development environment requires you to fork the KerasCV repository, clone the repository, install dependencies, and execute `python setup.py develop`. 
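Circling back to the color-based preprocessing guideline in `API_DESIGN.md` above (color layers accept inputs of any `dtype`), the usual way to honor that is to cast internally and cast back. The helper below is only a sketch of that pattern; the function name and the clipping are illustrative assumptions, not KerasCV code.

```python
import tensorflow as tf


def apply_color_op(images, color_op):
    """Run a float-based color op on images of any dtype."""
    original_dtype = images.dtype
    # Cast to float32 internally instead of raising an error on integer inputs.
    images = tf.cast(images, tf.float32)
    images = color_op(images)
    # Cast back so callers get out the dtype they passed in.
    return tf.cast(images, original_dtype)


# Example: brighten uint8 images without the caller having to cast first.
images = tf.random.uniform((2, 8, 8, 3), maxval=256, dtype=tf.int32)
images = tf.cast(images, tf.uint8)
brightened = apply_color_op(
    images, lambda x: tf.clip_by_value(x + 32.0, 0.0, 255.0)
)
```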
@@ -157,7 +157,7 @@ cp bazel-bin/keras_cv/custom_ops/*.so keras_cv/custom_ops/ Tests which use custom ops are disabled by default, but can be run by setting the environment variable `TEST_CUSTOM_OPS=true`. ## Formatting the Code -We use `flake8`, `isort`, `black` and `clang-format` for code formatting. You can run +We use `flake8`, `isort`, `black` and `clang-format` for code formatting. You can run the following commands manually every time you want to format your code: - Run `shell/format.sh` to format your code diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md index 52284e358d..5754b5830d 100644 --- a/.github/ISSUE_TEMPLATE/feature_request.md +++ b/.github/ISSUE_TEMPLATE/feature_request.md @@ -19,7 +19,7 @@ Include citation counts if possible. **Existing Implementations** **Other Information** diff --git a/.github/ROADMAP.md b/.github/ROADMAP.md index d1097a5f63..264fa85a91 100644 --- a/.github/ROADMAP.md +++ b/.github/ROADMAP.md @@ -1,6 +1,6 @@ # Roadmap The team will release 2 quarters of roadmap in advance so contributors will know -what we are working on and be better aligned when creating PRs. +what we are working on and be better aligned when creating PRs. As an exception, widely used backbones are always welcome. Contributors can search for `contribution-welcome` label on github. The team will release one minor version upgrade each quarter, or whenever a new task is officially supported. diff --git a/README.md b/README.md index f6b9402467..1aa92d80e2 100644 --- a/README.md +++ b/README.md @@ -109,9 +109,9 @@ We would like to leverage/outsource the Keras community not only for bug reporti but also for active development for feature delivery. To achieve this, here is the predefined process for how to contribute to this repository: -1) Contributors are always welcome to help us fix an issue, add tests, better documentation. +1) Contributors are always welcome to help us fix an issue, add tests, better documentation. 2) If contributors would like to create a backbone, we usually require a pre-trained weight set -with the model for one dataset as the first PR, and a training script as a follow-up. The training script will preferrably help us reproduce the results claimed from paper. The backbone should be generic but the training script can contain paper specific parameters such as learning rate schedules and weight decays. The training script will be used to produce leaderboard results. +with the model for one dataset as the first PR, and a training script as a follow-up. The training script will preferrably help us reproduce the results claimed from paper. The backbone should be generic but the training script can contain paper specific parameters such as learning rate schedules and weight decays. The training script will be used to produce leaderboard results. Exceptions apply to large transformer-based models which are difficult to train. If this is the case, contributors should let us know so the team can help in training the model or providing GCP resources. 3) If contributors would like to create a meta arch, please try to be aligned with our roadmap and create a PR for design review to make sure the meta arch is modular. @@ -137,7 +137,7 @@ An example of this can be found in the ImageNet classification training All results are reproducible using the training scripts in this repository. Historically, many models have been trained on image datasets rescaled via manually -crafted normalization schemes. 
+crafted normalization schemes. The most common variant of manually crafted normalization scheme is subtraction of the imagenet mean pixel followed by standard deviation normalization based on the imagenet pixel standard deviation. @@ -158,7 +158,7 @@ instructions below. ### Installing KerasCV with Custom Ops from Source Installing custom ops from source requires the [Bazel](https://bazel.build/) build -system (version >= 5.4.0). Steps to install Bazel can be [found here](https://github.com/keras-team/keras/blob/v2.11.0/.devcontainer/Dockerfile#L21-L23). +system (version >= 5.4.0). Steps to install Bazel can be [found here](https://github.com/keras-team/keras/blob/v2.11.0/.devcontainer/Dockerfile#L21-L23). ``` git clone https://github.com/keras-team/keras-cv.git diff --git a/benchmarks/metrics/coco/mean_average_precision_bucket_performance.py b/benchmarks/metrics/coco/mean_average_precision_bucket_performance.py index e05742ce40..d139697e93 100644 --- a/benchmarks/metrics/coco/mean_average_precision_bucket_performance.py +++ b/benchmarks/metrics/coco/mean_average_precision_bucket_performance.py @@ -18,9 +18,9 @@ def produce_random_data( """Generates a fake list of bounding boxes for use in this test. Returns: - a tensor list of size [128, 25, 5/6]. This represents 128 images, 25 bboxes - and 5/6 dimensions to represent each bbox depending on if confidence is - set. + a tensor list of size [128, 25, 5/6]. This represents 128 images, 25 + bboxes and 5/6 dimensions to represent each bbox depending on if + confidence is set. """ images = [] for _ in range(num_images): diff --git a/benchmarks/metrics/coco/mean_average_precision_performance.py b/benchmarks/metrics/coco/mean_average_precision_performance.py index 9dd4e3852b..660977ca9a 100644 --- a/benchmarks/metrics/coco/mean_average_precision_performance.py +++ b/benchmarks/metrics/coco/mean_average_precision_performance.py @@ -18,9 +18,9 @@ def produce_random_data( """Generates a fake list of bounding boxes for use in this test. Returns: - a tensor list of size [128, 25, 5/6]. This represents 128 images, 25 bboxes - and 5/6 dimensions to represent each bbox depending on if confidence is - set. + a tensor list of size [128, 25, 5/6]. This represents 128 images, 25 + bboxes and 5/6 dimensions to represent each bbox depending on if + confidence is set. """ images = [] for _ in range(num_images): diff --git a/benchmarks/metrics/coco/recall_performance.py b/benchmarks/metrics/coco/recall_performance.py index df61cd5cc1..9819a52608 100644 --- a/benchmarks/metrics/coco/recall_performance.py +++ b/benchmarks/metrics/coco/recall_performance.py @@ -18,9 +18,9 @@ def produce_random_data( """Generates a fake list of bounding boxes for use in this test. Returns: - a tensor list of size [128, 25, 5/6]. This represents 128 images, 25 bboxes - and 5/6 dimensions to represent each bbox depending on if confidence is - set. + a tensor list of size [128, 25, 5/6]. This represents 128 images, 25 + bboxes and 5/6 dimensions to represent each bbox depending on if + confidence is set. """ images = [] for _ in range(num_images): diff --git a/benchmarks/vectorization_strategy_benchmark.py b/benchmarks/vectorization_strategy_benchmark.py index 80db7351d6..18e9c52fb4 100644 --- a/benchmarks/vectorization_strategy_benchmark.py +++ b/benchmarks/vectorization_strategy_benchmark.py @@ -78,12 +78,13 @@ def fill_single_rectangle( """Fill rectangles with fill value into images. Args: - images: Tensor of images to fill rectangles into. 
+ image: Tensor of images to fill rectangles into. centers_x: Tensor of positions of the rectangle centers on the x-axis. centers_y: Tensor of positions of the rectangle centers on the y-axis. widths: Tensor of widths of the rectangles heights: Tensor of heights of the rectangles - fill_values: Tensor with same shape as images to get rectangle fill from. + fill_values: Tensor with same shape as images to get rectangle fill + from. Returns: images with filled rectangles. """ @@ -127,7 +128,7 @@ def __init__( if fill_mode not in ["gaussian_noise", "constant"]: raise ValueError( '`fill_mode` should be "gaussian_noise" ' - f'or "constant". Got `fill_mode`={fill_mode}' + f'or "constant". Got `fill_mode`={fill_mode}' ) if not isinstance(self.height_lower, type(self.height_upper)): @@ -307,7 +308,7 @@ def __init__( if fill_mode not in ["gaussian_noise", "constant"]: raise ValueError( '`fill_mode` should be "gaussian_noise" ' - f'or "constant". Got `fill_mode`={fill_mode}' + f'or "constant". Got `fill_mode`={fill_mode}' ) if not isinstance(self.height_lower, type(self.height_upper)): @@ -481,7 +482,7 @@ def __init__( if fill_mode not in ["gaussian_noise", "constant"]: raise ValueError( '`fill_mode` should be "gaussian_noise" ' - f'or "constant". Got `fill_mode`={fill_mode}' + f'or "constant". Got `fill_mode`={fill_mode}' ) if not isinstance(self.height_lower, type(self.height_upper)): @@ -657,7 +658,7 @@ def __init__( if fill_mode not in ["gaussian_noise", "constant"]: raise ValueError( '`fill_mode` should be "gaussian_noise" ' - f'or "constant". Got `fill_mode`={fill_mode}' + f'or "constant". Got `fill_mode`={fill_mode}' ) if not isinstance(self.height_lower, type(self.height_upper)): @@ -837,7 +838,7 @@ def __init__( if fill_mode not in ["gaussian_noise", "constant"]: raise ValueError( '`fill_mode` should be "gaussian_noise" ' - f'or "constant". Got `fill_mode`={fill_mode}' + f'or "constant". Got `fill_mode`={fill_mode}' ) if not isinstance(self.height_lower, type(self.height_upper)): @@ -1011,7 +1012,7 @@ def __init__( if fill_mode not in ["gaussian_noise", "constant"]: raise ValueError( '`fill_mode` should be "gaussian_noise" ' - f'or "constant". Got `fill_mode`={fill_mode}' + f'or "constant". Got `fill_mode`={fill_mode}' ) if not isinstance(self.height_lower, type(self.height_upper)): @@ -1234,7 +1235,7 @@ def get_config(self): # Extra notes ## Warnings -it would be really annoying as a user to use an official keras_cv component and get -warned that "RandomUniform" or "RandomUniformInt" inside pfor may not get the same -output. +it would be really annoying as a user to use an official keras_cv component and +get warned that "RandomUniform" or "RandomUniformInt" inside pfor may not get +the same output. """ diff --git a/benchmarks/vectorized_auto_contrast.py b/benchmarks/vectorized_auto_contrast.py index 0c166b55c8..9237dd6aa9 100644 --- a/benchmarks/vectorized_auto_contrast.py +++ b/benchmarks/vectorized_auto_contrast.py @@ -28,15 +28,15 @@ class OldAutoContrast(BaseImageAugmentationLayer): """Performs the AutoContrast operation on an image. Auto contrast stretches the values of an image across the entire available - `value_range`. This makes differences between pixels more obvious. An example of - this is if an image only has values `[0, 1]` out of the range `[0, 255]`, auto - contrast will change the `1` values to be `255`. + `value_range`. This makes differences between pixels more obvious. 
An + example of this is if an image only has values `[0, 1]` out of the range + `[0, 255]`, auto contrast will change the `1` values to be `255`. Args: value_range: the range of values the incoming images will have. Represented as a two number tuple written [low, high]. This is typically either `[0, 1]` or `[0, 255]` depending - on how your preprocessing pipeline is setup. + on how your preprocessing pipeline is set up. """ def __init__( diff --git a/benchmarks/vectorized_channel_shuffle.py b/benchmarks/vectorized_channel_shuffle.py index c210679922..b75d3be43c 100644 --- a/benchmarks/vectorized_channel_shuffle.py +++ b/benchmarks/vectorized_channel_shuffle.py @@ -38,15 +38,17 @@ class OldChannelShuffle(BaseImageAugmentationLayer): `(..., height, width, channels)`, in `"channels_last"` format Args: - groups: Number of groups to divide the input channels. Default 3. + groups: Number of groups to divide the input channels, defaults to 3. seed: Integer. Used to create a random seed. Call arguments: inputs: Tensor representing images of shape - `(batch_size, width, height, channels)`, with dtype tf.float32 / tf.uint8, - ` or (width, height, channels)`, with dtype tf.float32 / tf.uint8 - training: A boolean argument that determines whether the call should be run - in inference mode or training mode. Default: True. + `(batch_size, width, height, channels)`, with dtype + tf.float32 / tf.uint8, + ` or (width, height, channels)`, with dtype + tf.float32 / tf.uint8 + training: A boolean argument that determines whether the call should be + run in inference mode or training mode, defaults to True. Usage: ```python diff --git a/benchmarks/vectorized_grayscale.py b/benchmarks/vectorized_grayscale.py index e58ec681fd..cea07745c8 100644 --- a/benchmarks/vectorized_grayscale.py +++ b/benchmarks/vectorized_grayscale.py @@ -25,7 +25,8 @@ class OldGrayscale(BaseImageAugmentationLayer): - """Grayscale is a preprocessing layer that transforms RGB images to Grayscale images. + """Grayscale is a preprocessing layer that transforms RGB images to + Grayscale images. Input images should have values in the range of [0, 255]. Input shape: 3D (unbatched) or 4D (batched) tensor with shape: diff --git a/benchmarks/vectorized_mosaic.py b/benchmarks/vectorized_mosaic.py index d1e7d12457..745022764f 100644 --- a/benchmarks/vectorized_mosaic.py +++ b/benchmarks/vectorized_mosaic.py @@ -24,10 +24,10 @@ from keras_cv.layers.preprocessing.base_image_augmentation_layer import ( BaseImageAugmentationLayer, ) -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 IMAGES, ) -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 LABELS, ) from keras_cv.utils import preprocessing as preprocessing_utils @@ -38,7 +38,7 @@ class OldMosaic(BaseImageAugmentationLayer): Mosaic data augmentation first takes 4 images from the batch and makes a grid. After that based on the offset, a crop is taken to form the mosaic - image. Labels are in the same ratio as the the area of their images in the + image. Labels are in the same ratio as the area of their images in the output image. Bounding boxes are translated according to the position of the 4 images. 
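The label handling described in the `OldMosaic` docstring above amounts to an area-weighted mix of the four source labels. A tiny numeric sketch of the idea, with illustrative values rather than the KerasCV implementation:

```python
import tensorflow as tf

# Hypothetical one-hot labels for the four images that form the mosaic.
labels = tf.constant([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
# Fraction of the output image covered by each of the four source images.
areas = tf.constant([0.40, 0.20, 0.25, 0.15])

# The mosaic label is the area-weighted combination of the source labels.
mosaic_label = tf.reduce_sum(areas[:, tf.newaxis] * labels, axis=0)
# -> [0.65, 0.35]
```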
@@ -58,9 +58,8 @@ class OldMosaic(BaseImageAugmentationLayer): may contain additional information such as classes and confidence after these 4 values but these values will be ignored and returned as is. For detailed information on the supported formats, see the - [KerasCV bounding box documentation](https://keras.io/api/keras_cv/bounding_box/formats/). - Defualts to None. - seed: Integer. Used to create a random seed. + [KerasCV bounding box documentation](https://keras.io/api/keras_cv/bounding_box/formats/). Defaults to None. + seed: integer, used to create a random seed. References: - [Yolov4 paper](https://arxiv.org/pdf/2004.10934). @@ -76,7 +75,7 @@ class OldMosaic(BaseImageAugmentationLayer): output = mosaic({'images': images, 'labels': labels}) # output == {'images': updated_images, 'labels': updated_labels} ``` - """ + """ # noqa: E501 def __init__( self, offset=(0.25, 0.75), bounding_box_format=None, seed=None, **kwargs @@ -196,9 +195,9 @@ def _batch_augment(self, inputs): def _augment(self, inputs): raise ValueError( - "Mosaic received a single image to `call`. The layer relies on " + "Mosaic received a single image to `call`. The layer relies on " "combining multiple examples, and as such will not behave as " - "expected. Please call the layer with 4 or more samples." + "expected. Please call the layer with 4 or more samples." ) def _update_image(self, images, permutation_order, mosaic_centers, index): diff --git a/benchmarks/vectorized_random_brightness.py b/benchmarks/vectorized_random_brightness.py index 8df636f4b7..9babe0d692 100644 --- a/benchmarks/vectorized_random_brightness.py +++ b/benchmarks/vectorized_random_brightness.py @@ -46,9 +46,9 @@ class OldRandomBrightness(BaseImageAugmentationLayer): is provided, eg, 0.2, then -0.2 will be used for lower bound and 0.2 will be used for upper bound. value_range: Optional list/tuple of 2 floats for the lower and upper limit - of the values of the input data. Defaults to [0.0, 255.0]. Can be + of the values of the input data, defaults to [0.0, 255.0]. Can be changed to e.g. [0.0, 1.0] if the image input has been scaled before - this layer. The brightness adjustment will be scaled to this range, and + this layer. The brightness adjustment will be scaled to this range, and the output values will be clipped to this range. seed: optional integer, for fixed RNG behavior. Inputs: 3D (HWC) or 4D (NHWC) tensor, with float or int dtype. Input pixel diff --git a/benchmarks/vectorized_random_color_jitter.py b/benchmarks/vectorized_random_color_jitter.py index 1d7a6eee2c..7c1d210a30 100644 --- a/benchmarks/vectorized_random_color_jitter.py +++ b/benchmarks/vectorized_random_color_jitter.py @@ -40,10 +40,10 @@ class OldRandomColorJitter(BaseImageAugmentationLayer): `(..., height, width, channels)`, in `channels_last` format Args: - value_range: the range of values the incoming images will have. + value_range: the range of values the incoming images will have. Represented as a two number tuple written [low, high]. This is typically either `[0, 1]` or `[0, 255]` depending - on how your preprocessing pipeline is setup. + on how your preprocessing pipeline is set up. brightness_factor: Float or a list/tuple of 2 floats between -1.0 and 1.0. The factor is used to determine the lower bound and upper bound of the brightness adjustment. A float value will be @@ -65,11 +65,11 @@ class OldRandomColorJitter(BaseImageAugmentationLayer): `keras_cv.FactorSampler`. `factor` controls the extent to which the image sharpness is impacted. 
`factor=0.0` makes this layer perform a no-op operation, while a value of 1.0 performs the most aggressive - contrast adjustment available. If a tuple is used, a `factor` is sampled - between the two values for every image augmented. If a single float - is used, a value between `0.0` and the passed float is sampled. - In order to ensure the value is always the same, please pass a tuple - with two identical floats: `(0.5, 0.5)`. + contrast adjustment available. If a tuple is used, a `factor` is + sampled between the two values for every image augmented. If a + single float is used, a value between `0.0` and the passed float is + sampled. In order to ensure the value is always the same, please + pass a tuple with two identical floats: `(0.5, 0.5)`. seed: Integer. Used to create a random seed. Usage: diff --git a/benchmarks/vectorized_random_hue.py b/benchmarks/vectorized_random_hue.py index 71f9b57def..faf1615267 100644 --- a/benchmarks/vectorized_random_hue.py +++ b/benchmarks/vectorized_random_hue.py @@ -37,18 +37,19 @@ class OldRandomHue(BaseImageAugmentationLayer): hue channel (H) by delta. The image is then converted back to RGB. Args: - factor: A tuple of two floats, a single float or `keras_cv.FactorSampler`. - `factor` controls the extent to which the image hue is impacted. - `factor=0.0` makes this layer perform a no-op operation, while a value of - 1.0 performs the most aggressive contrast adjustment available. If a tuple - is used, a `factor` is sampled between the two values for every image - augmented. If a single float is used, a value between `0.0` and the passed - float is sampled. In order to ensure the value is always the same, please + factor: A tuple of two floats, a single float or + `keras_cv.FactorSampler`. `factor` controls the extent to which the + image hue is impacted. `factor=0.0` makes this layer perform a no-op + operation, while a value of 1.0 performs the most aggressive + contrast adjustment available. If a tuple is used, a `factor` is + sampled between the two values for every image augmented. If a + single float is used, a value between `0.0` and the passed float is + sampled. In order to ensure the value is always the same, please pass a tuple with two identical floats: `(0.5, 0.5)`. - value_range: the range of values the incoming images will have. + value_range: the range of values the incoming images will have. Represented as a two number tuple written [low, high]. This is typically either `[0, 1]` or `[0, 255]` depending - on how your preprocessing pipeline is setup. + on how your preprocessing pipeline is set up. seed: Integer. Used to create a random seed. """ @@ -63,9 +64,10 @@ def __init__(self, factor, value_range, seed=None, **kwargs): def get_random_transformation(self, **kwargs): invert = preprocessing_utils.random_inversion(self._random_generator) - # We must scale self.factor() to the range [-0.5, 0.5]. This is because the - # tf.image operation performs rotation on the hue saturation value orientation. - # This can be thought of as an angle in the range [-180, 180] + # We must scale self.factor() to the range [-0.5, 0.5]. This is because + # the tf.image operation performs rotation on the hue saturation value + # orientation. 
This can be thought of as an angle in the range + # [-180, 180] return invert * self.factor() * 0.5 def augment_image(self, image, transformation=None, **kwargs): diff --git a/benchmarks/vectorized_random_rotation.py b/benchmarks/vectorized_random_rotation.py index e94b4a883b..0c8821f632 100644 --- a/benchmarks/vectorized_random_rotation.py +++ b/benchmarks/vectorized_random_rotation.py @@ -41,7 +41,7 @@ class OldRandomRotation(BaseImageAugmentationLayer): rotations at inference time, set `training` to True when calling the layer. Input pixel values can be of any range (e.g. `[0., 1.)` or `[0, 255]`) and - of interger or floating point dtype. By default, the layer will output + of integer or floating point dtype. By default, the layer will output floats. Input shape: @@ -154,9 +154,10 @@ def augment_bounding_boxes( ): if self.bounding_box_format is None: raise ValueError( - "`RandomRotation()` was called with bounding boxes," - "but no `bounding_box_format` was specified in the constructor." - "Please specify a bounding box format in the constructor. i.e." + "`RandomRotation()` was called with bounding boxes, " + "but no `bounding_box_format` was specified in the " + "constructor. Please specify a bounding box format in the " + "constructor. i.e. " "`RandomRotation(bounding_box_format='xyxy')`" ) @@ -190,7 +191,7 @@ def augment_bounding_boxes( ) # point_x : x coordinates of all corners of the bounding box point_x = tf.gather(point, [0], axis=2) - # point_y : y cordinates of all corners of the bounding box + # point_y : y coordinates of all corners of the bounding box point_y = tf.gather(point, [1], axis=2) # rotated bounding box coordinates # new_x : new position of x coordinates of corners of bounding box @@ -217,9 +218,9 @@ def augment_bounding_boxes( out = tf.concat([new_x, new_y], axis=2) # find readjusted coordinates of bounding box to represent it in corners # format - min_cordinates = tf.math.reduce_min(out, axis=1) - max_cordinates = tf.math.reduce_max(out, axis=1) - boxes = tf.concat([min_cordinates, max_cordinates], axis=1) + min_coordinates = tf.math.reduce_min(out, axis=1) + max_coordinates = tf.math.reduce_max(out, axis=1) + boxes = tf.concat([min_coordinates, max_coordinates], axis=1) bounding_boxes = bounding_boxes.copy() bounding_boxes["boxes"] = boxes @@ -228,7 +229,7 @@ def augment_bounding_boxes( bounding_box_format="xyxy", images=image, ) - # cordinates cannot be float values, it is casted to int32 + # coordinates cannot be float values, it is casted to int32 bounding_boxes = bounding_box.convert_format( bounding_boxes, source="xyxy", @@ -244,10 +245,10 @@ def augment_label(self, label, transformation, **kwargs): def augment_segmentation_mask( self, segmentation_mask, transformation, **kwargs ): - # If segmentation_classes is specified, we have a dense segmentation mask. - # We therefore one-hot encode before rotation to avoid bad interpolation - # during the rotation transformation. We then make the mask sparse - # again using tf.argmax. + # If segmentation_classes is specified, we have a dense segmentation + # mask. We therefore one-hot encode before rotation to avoid bad + # interpolation during the rotation transformation. We then make the + # mask sparse again using tf.argmax. 
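        # (Interpolating raw integer class ids would blend neighboring labels
        # into meaningless in-between values, e.g. classes 1 and 3 into 2.)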
if self.segmentation_classes: one_hot_mask = tf.one_hot( tf.squeeze(segmentation_mask, axis=-1), @@ -268,7 +269,8 @@ def augment_segmentation_mask( ) rotated_mask = self._rotate_image(segmentation_mask, transformation) # Round because we are in one-hot encoding, and we may have - # pixels with ambugious value due to floating point math for rotation. + # pixels with ambiguous value due to floating point math for + # rotation. return tf.round(rotated_mask) def get_config(self): diff --git a/benchmarks/vectorized_random_saturation.py b/benchmarks/vectorized_random_saturation.py index 730f4f687c..1ef6166e84 100644 --- a/benchmarks/vectorized_random_saturation.py +++ b/benchmarks/vectorized_random_saturation.py @@ -34,16 +34,16 @@ class OldRandomSaturation(BaseImageAugmentationLayer): Call the layer with `training=True` to adjust the saturation of the input. Args: - factor: A tuple of two floats, a single float or `keras_cv.FactorSampler`. - `factor` controls the extent to which the image saturation is impacted. - `factor=0.5` makes this layer perform a no-op operation. `factor=0.0` makes - the image to be fully grayscale. `factor=1.0` makes the image to be fully - saturated. - Values should be between `0.0` and `1.0`. If a tuple is used, a `factor` - is sampled between the two values for every image augmented. If a single - float is used, a value between `0.0` and the passed float is sampled. - In order to ensure the value is always the same, please pass a tuple with - two identical floats: `(0.5, 0.5)`. + factor: A tuple of two floats, a single float or + `keras_cv.FactorSampler`. `factor` controls the extent to which the + image saturation is impacted. `factor=0.5` makes this layer perform + a no-op operation. `factor=0.0` makes the image to be fully + grayscale. `factor=1.0` makes the image to be fully saturated. + Values should be between `0.0` and `1.0`. If a tuple is used, a + `factor` is sampled between the two values for every image + augmented. If a single float is used, a value between `0.0` and the + passed float is sampled. In order to ensure the value is always the + same, please pass a tuple with two identical floats: `(0.5, 0.5)`. seed: Integer. Used to create a random seed. """ @@ -61,9 +61,9 @@ def get_random_transformation(self, **kwargs): def augment_image(self, image, transformation=None, **kwargs): # Convert the factor range from [0, 1] to [0, +inf]. Note that the - # tf.image.adjust_saturation is trying to apply the following math formula - # `output_saturation = input_saturation * factor`. We use the following - # method to the do the mapping. + # tf.image.adjust_saturation is trying to apply the following math + # formula `output_saturation = input_saturation * factor`. We use the + # following method to the do the mapping. # `y = x / (1 - x)`. # This will ensure: # y = +inf when x = 1 (full saturation) @@ -71,8 +71,8 @@ def augment_image(self, image, transformation=None, **kwargs): # y = 0 when x = 0 (full gray scale) # Convert the transformation to tensor in case it is a float. When - # transformation is 1.0, then it will result in to divide by zero error, but - # it will be handled correctly when it is a one tensor. + # transformation is 1.0, then it will result in to divide by zero error, + # but it will be handled correctly when it is a one tensor. 
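        # (With a tensor, 1.0 / (1.0 - 1.0) evaluates to inf rather than
        # raising a ZeroDivisionError, which is why the conversion below is
        # enough.)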
transformation = tf.convert_to_tensor(transformation) adjust_factor = transformation / (1 - transformation) return tf.image.adjust_saturation( diff --git a/benchmarks/vectorized_random_sharpness.py b/benchmarks/vectorized_random_sharpness.py index 7a67697e65..3d0422d414 100644 --- a/benchmarks/vectorized_random_sharpness.py +++ b/benchmarks/vectorized_random_sharpness.py @@ -13,29 +13,30 @@ class OldRandomSharpness(BaseImageAugmentationLayer): """Randomly performs the sharpness operation on given images. - The sharpness operation first performs a blur operation, then blends between the - original image and the blurred image. This operation makes the edges of an image - less sharp than they were in the original image. + The sharpness operation first performs a blur operation, then blends between + the original image and the blurred image. This operation makes the edges of + an image less sharp than they were in the original image. References: - [PIL](https://pillow.readthedocs.io/en/stable/reference/ImageEnhance.html) Args: - factor: A tuple of two floats, a single float or `keras_cv.FactorSampler`. - `factor` controls the extent to which the image sharpness is impacted. - `factor=0.0` makes this layer perform a no-op operation, while a value of - 1.0 uses the sharpened result entirely. Values between 0 and 1 result in - linear interpolation between the original image and the sharpened image. - Values should be between `0.0` and `1.0`. If a tuple is used, a `factor` is - sampled between the two values for every image augmented. If a single float - is used, a value between `0.0` and the passed float is sampled. In order to - ensure the value is always the same, please pass a tuple with two identical - floats: `(0.5, 0.5)`. + factor: A tuple of two floats, a single float or + `keras_cv.FactorSampler`. `factor` controls the extent to which the + image sharpness is impacted. `factor=0.0` makes this layer perform a + no-op operation, while a value of 1.0 uses the sharpened result + entirely. Values between 0 and 1 result in linear interpolation + between the original image and the sharpened image. Values should be + between `0.0` and `1.0`. If a tuple is used, a `factor` is sampled + between the two values for every image augmented. If a single float + is used, a value between `0.0` and the passed float is sampled. In + order to ensure the value is always the same, please pass a tuple + with two identical floats: `(0.5, 0.5)`. value_range: the range of values the incoming images will have. Represented as a two number tuple written [low, high]. This is typically either `[0, 1]` or `[0, 255]` depending - on how your preprocessing pipeline is setup. - """ + on how your preprocessing pipeline is set up. + """ # noqa: E501 def __init__( self, @@ -68,8 +69,8 @@ def augment_image(self, image, transformation=None, **kwargs): # [1 5 1] # [1 1 1] # all divided by 13 is the default 3x3 gaussian smoothing kernel. - # Correlating or Convolving with this filter is equivalent to performing a - # gaussian blur. + # Correlating or Convolving with this filter is equivalent to performing + # a gaussian blur. 
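        # (The eight 1s plus the center 5 sum to 13, so dividing by 13 makes
        # the kernel sum to one and preserves overall brightness.)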
kernel = ( tf.constant( [[1, 1, 1], [1, 5, 1], [1, 1, 1]], diff --git a/benchmarks/vectorized_random_shear.py b/benchmarks/vectorized_random_shear.py index 43b64703e6..f655a45e7b 100644 --- a/benchmarks/vectorized_random_shear.py +++ b/benchmarks/vectorized_random_shear.py @@ -33,33 +33,32 @@ class OldRandomShear(BaseImageAugmentationLayer): `(..., height, width, channels)`, in `"channels_last"` format Args: x_factor: A tuple of two floats, a single float or a - `keras_cv.FactorSampler`. For each augmented image a value is sampled - from the provided range. If a float is passed, the range is interpreted as - `(0, x_factor)`. Values represent a percentage of the image to shear over. - For example, 0.3 shears pixels up to 30% of the way across the image. - All provided values should be positive. If `None` is passed, no shear - occurs on the X axis. - Defaults to `None`. + `keras_cv.FactorSampler`. For each augmented image a value is + sampled from the provided range. If a float is passed, the range is + interpreted as `(0, x_factor)`. Values represent a percentage of the + image to shear over. For example, 0.3 shears pixels up to 30% of the + way across the image. All provided values should be positive. If + `None` is passed, no shear occurs on the X axis. Defaults to `None`. y_factor: A tuple of two floats, a single float or a - `keras_cv.FactorSampler`. For each augmented image a value is sampled - from the provided range. If a float is passed, the range is interpreted as - `(0, y_factor)`. Values represent a percentage of the image to shear over. - For example, 0.3 shears pixels up to 30% of the way across the image. - All provided values should be positive. If `None` is passed, no shear - occurs on the Y axis. - Defaults to `None`. - interpolation: interpolation method used in the `ImageProjectiveTransformV3` op. - Supported values are `"nearest"` and `"bilinear"`. - Defaults to `"bilinear"`. - fill_mode: fill_mode in the `ImageProjectiveTransformV3` op. - Supported values are `"reflect"`, `"wrap"`, `"constant"`, and `"nearest"`. - Defaults to `"reflect"`. + `keras_cv.FactorSampler`. For each augmented image a value is + sampled from the provided range. If a float is passed, the range is + interpreted as `(0, y_factor)`. Values represent a percentage of the + image to shear over. For example, 0.3 shears pixels up to 30% of the + way across the image. All provided values should be positive. If + `None` is passed, no shear occurs on the Y axis. Defaults to `None`. + interpolation: interpolation method used in the + `ImageProjectiveTransformV3` op. Supported values are `"nearest"` + and `"bilinear"`. Defaults to `"bilinear"`. + fill_mode: fill_mode in the `ImageProjectiveTransformV3` op. Supported + values are `"reflect"`, `"wrap"`, `"constant"`, and `"nearest"`. + Defaults to `"reflect"`. fill_value: fill_value in the `ImageProjectiveTransformV3` op. - A `Tensor` of type `float32`. The value to be filled when fill_mode is - constant". Defaults to `0.0`. - bounding_box_format: The format of bounding boxes of input dataset. Refer to - https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box/converters.py - for more details on supported bounding box formats. + A `Tensor` of type `float32`. The value to be filled when fill_mode + is constant". Defaults to `0.0`. + bounding_box_format: The format of bounding boxes of input dataset. 
+ Refer to + https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box/converters.py + for more details on supported bounding box formats. seed: Integer. Used to create a random seed. """ @@ -89,8 +88,9 @@ def __init__( self.y_factor = y_factor if x_factor is None and y_factor is None: warnings.warn( - "RandomShear received both `x_factor=None` and `y_factor=None`. As a " - "result, the layer will perform no augmentation." + "RandomShear received both `x_factor=None` and " + "`y_factor=None`. As a result, the layer will perform no " + "augmentation." ) self.interpolation = interpolation self.fill_mode = fill_mode @@ -149,10 +149,10 @@ def augment_bounding_boxes( ): if self.bounding_box_format is None: raise ValueError( - "`RandomShear()` was called with bounding boxes," - "but no `bounding_box_format` was specified in the constructor." - "Please specify a bounding box format in the constructor. i.e." - "`RandomShear(bounding_box_format='xyxy')`" + "`RandomShear()` was called with bounding boxes, " + "but no `bounding_box_format` was specified in the " + "constructor. Please specify a bounding box format in the " + "constructor. i.e. `RandomShear(bounding_box_format='xyxy')`" ) bounding_boxes = keras_cv.bounding_box.convert_format( bounding_boxes, diff --git a/benchmarks/vectorized_random_translation.py b/benchmarks/vectorized_random_translation.py index 2be78c5472..2146d7cae4 100644 --- a/benchmarks/vectorized_random_translation.py +++ b/benchmarks/vectorized_random_translation.py @@ -94,7 +94,7 @@ class OldRandomTranslation(BaseImageAugmentationLayer): shifting image down. When represented as a single positive float, this value is used for both the upper and lower bound. For instance, `height_factor=(-0.2, 0.3)` results in an output shifted by a random - amount in the range `[-20%, +30%]`. `height_factor=0.2` results in an + amount in the range `[-20%, +30%]`. `height_factor=0.2` results in an output height shifted by a random amount in the range `[-20%, +20%]`. width_factor: a float represented as fraction of value, or a tuple of size 2 representing lower and upper bound for shifting horizontally. A diff --git a/benchmarks/vectorized_randomly_zoomed_crop.py b/benchmarks/vectorized_randomly_zoomed_crop.py index ba30bec13a..66a12fec41 100644 --- a/benchmarks/vectorized_randomly_zoomed_crop.py +++ b/benchmarks/vectorized_randomly_zoomed_crop.py @@ -48,12 +48,12 @@ class OldRandomlyZoomedCrop(BaseImageAugmentationLayer): aspect ratio sampled represents a value to distort the aspect ratio by. Represents the lower and upper bound for the aspect ratio of the - cropped image before resizing it to `(height, width)`. For most - tasks, this should be `(3/4, 4/3)`. To perform a no-op provide the + cropped image before resizing it to `(height, width)`. For most + tasks, this should be `(3/4, 4/3)`. To perform a no-op provide the value `(1.0, 1.0)`. interpolation: (Optional) A string specifying the sampling method for - resizing. Defaults to "bilinear". - seed: (Optional) Used to create a random seed. Defaults to None. + resizing, defaults to "bilinear". + seed: (Optional) Used to create a random seed, defaults to None. 
""" def __init__( diff --git a/examples/benchmarking/imagenet_v2.py b/examples/benchmarking/imagenet_v2.py index d79beba785..b536c8c091 100644 --- a/examples/benchmarking/imagenet_v2.py +++ b/examples/benchmarking/imagenet_v2.py @@ -16,7 +16,8 @@ Author: [DavidLandup0](https://github.com/DavidLandup0) Date created: 2022/12/14 Last modified: 2022/12/14 -Description: Use KerasCV architectures and benchmark them against ImageNetV2 from TensorFlow Datasets +Description: Use KerasCV architectures and benchmark them against ImageNetV2 +from TensorFlow Datasets """ import sys @@ -39,7 +40,8 @@ flags.DEFINE_string( "model_kwargs", "{}", - "Keyword argument dictionary to pass to the constructor of the model being evaluated.", + "Keyword argument dictionary to pass to the constructor of the model being" + " evaluated.", ) flags.DEFINE_integer( @@ -81,8 +83,8 @@ def preprocess_image(img, label): # Todo -# Include imagenet_val and imagenet_real as well and report -# results for all three +# Include imagenet_val and imagenet_real as well and report +# results for all three (test_set), info = tfds.load( "imagenet_v2", split=["test"], as_supervised=True, with_info=True ) @@ -95,11 +97,13 @@ def preprocess_image(img, label): ) # Todo -# Create a nicer report, include inference time -# model size, etc. +# Create a nicer report, include inference time +# model size, etc. loss, acc, top_5 = model.evaluate(test_set, verbose=0) print( - f"Benchmark results:\n{'='*25}\n{FLAGS.model_name} achieves: \n - Top-1 Accuracy: {acc*100} \n - Top-5 Accuracy: {top_5*100} \non ImageNetV2 with setup:" + f"Benchmark results:\n{'='*25}\n{FLAGS.model_name} achieves: \n - Top-1 " + f"Accuracy: {acc*100} \n - Top-5 Accuracy: {top_5*100} \non ImageNetV2 " + "with setup:" ) print( f"- model_name: {FLAGS.model_name}\n" diff --git a/examples/layers/preprocessing/bounding_box/random_crop_and_resize_demo.py b/examples/layers/preprocessing/bounding_box/random_crop_and_resize_demo.py index ceb95af86a..93e2d21f46 100644 --- a/examples/layers/preprocessing/bounding_box/random_crop_and_resize_demo.py +++ b/examples/layers/preprocessing/bounding_box/random_crop_and_resize_demo.py @@ -12,8 +12,8 @@ # See the License for the specific language governing permissions and # limitations under the License. """ -random_crop_and_resize_demo.py shows how to use the RandomCropAndResize preprocessing layer for -object detection. +random_crop_and_resize_demo.py shows how to use the RandomCropAndResize +preprocessing layer for object detection. """ import demo_utils import tensorflow as tf diff --git a/examples/layers/preprocessing/bounding_box/random_translation_demo.py b/examples/layers/preprocessing/bounding_box/random_translation_demo.py index 58a758dd51..adcde4ef92 100644 --- a/examples/layers/preprocessing/bounding_box/random_translation_demo.py +++ b/examples/layers/preprocessing/bounding_box/random_translation_demo.py @@ -12,8 +12,8 @@ # See the License for the specific language governing permissions and # limitations under the License. """ -random_translation_demo.py shows how to use the RandomTranslation preprocessing layer for -object detection. +random_translation_demo.py shows how to use the RandomTranslation preprocessing +layer for object detection. 
""" import demo_utils import tensorflow as tf diff --git a/examples/layers/preprocessing/classification/aug_mix_demo.py b/examples/layers/preprocessing/classification/aug_mix_demo.py index 627fd9b7c3..26d6f23ae3 100644 --- a/examples/layers/preprocessing/classification/aug_mix_demo.py +++ b/examples/layers/preprocessing/classification/aug_mix_demo.py @@ -13,7 +13,7 @@ # limitations under the License. """aug_mix_demo.py shows how to use the AugMix preprocessing layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/channel_shuffle_demo.py b/examples/layers/preprocessing/classification/channel_shuffle_demo.py index 0d706af41e..4860936eb6 100644 --- a/examples/layers/preprocessing/classification/channel_shuffle_demo.py +++ b/examples/layers/preprocessing/classification/channel_shuffle_demo.py @@ -11,7 +11,8 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. -"""channel_shuffle_demo.py shows how to use the ChannelShuffle preprocessing layer. +"""channel_shuffle_demo.py shows how to use the ChannelShuffle preprocessing +layer. Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. diff --git a/examples/layers/preprocessing/classification/cut_mix_demo.py b/examples/layers/preprocessing/classification/cut_mix_demo.py index c73c0d319c..609992a9d3 100644 --- a/examples/layers/preprocessing/classification/cut_mix_demo.py +++ b/examples/layers/preprocessing/classification/cut_mix_demo.py @@ -13,7 +13,7 @@ # limitations under the License. """cut_mix_demo.py shows how to use the CutMix preprocessing layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/fourier_mix_demo.py b/examples/layers/preprocessing/classification/fourier_mix_demo.py index 78187257fb..9e74787647 100644 --- a/examples/layers/preprocessing/classification/fourier_mix_demo.py +++ b/examples/layers/preprocessing/classification/fourier_mix_demo.py @@ -12,7 +12,7 @@ # See the License for the specific language governing permissions and # limitations under the License. """fourier_mix_demo.py shows how to use the FourierMix preprocessing layer. -Uses the oxford_flowers102 dataset. In this script the flowers +Uses the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/grid_mask_demo.py b/examples/layers/preprocessing/classification/grid_mask_demo.py index 9b993c4459..315a783411 100644 --- a/examples/layers/preprocessing/classification/grid_mask_demo.py +++ b/examples/layers/preprocessing/classification/grid_mask_demo.py @@ -13,7 +13,7 @@ # limitations under the License. """gridmask_demo.py shows how to use the GridMask preprocessing layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. 
In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/mix_up_demo.py b/examples/layers/preprocessing/classification/mix_up_demo.py index aaa0107041..79a0145889 100644 --- a/examples/layers/preprocessing/classification/mix_up_demo.py +++ b/examples/layers/preprocessing/classification/mix_up_demo.py @@ -13,7 +13,7 @@ # limitations under the License. """mix_up_demo.py shows how to use the MixUp preprocessing layer. -Uses the oxford_flowers102 dataset. In this script the flowers +Uses the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/mosaic_demo.py b/examples/layers/preprocessing/classification/mosaic_demo.py index 4ea38629fc..26cf0eb7c9 100644 --- a/examples/layers/preprocessing/classification/mosaic_demo.py +++ b/examples/layers/preprocessing/classification/mosaic_demo.py @@ -13,7 +13,7 @@ # limitations under the License. """mosaic_demo.py shows how to use the Mosaic preprocessing layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/rand_augment_demo.py b/examples/layers/preprocessing/classification/rand_augment_demo.py index 133f59aaa7..22abd9dc16 100644 --- a/examples/layers/preprocessing/classification/rand_augment_demo.py +++ b/examples/layers/preprocessing/classification/rand_augment_demo.py @@ -13,7 +13,7 @@ # limitations under the License. """rand_augment_demo.py shows how to use the RandAugment preprocessing layer. -Uses the oxford_flowers102 dataset. In this script the flowers +Uses the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/random_augmentation_pipeline_demo.py b/examples/layers/preprocessing/classification/random_augmentation_pipeline_demo.py index 87fb472c9e..495e79d4a2 100644 --- a/examples/layers/preprocessing/classification/random_augmentation_pipeline_demo.py +++ b/examples/layers/preprocessing/classification/random_augmentation_pipeline_demo.py @@ -13,7 +13,7 @@ # limitations under the License. """rand_augment_demo.py shows how to use the RandAugment preprocessing layer. -Uses the oxford_flowers102 dataset. In this script the flowers +Uses the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/random_brightness_demo.py b/examples/layers/preprocessing/classification/random_brightness_demo.py index d3586f6a1b..0a34ee49f9 100644 --- a/examples/layers/preprocessing/classification/random_brightness_demo.py +++ b/examples/layers/preprocessing/classification/random_brightness_demo.py @@ -11,9 +11,10 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. -"""random_brightness_demo.py shows how to use the RandomBrightness preprocessing layer. 
+"""random_brightness_demo.py shows how to use the RandomBrightness preprocessing +layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/random_channel_shift_demo.py b/examples/layers/preprocessing/classification/random_channel_shift_demo.py index 34ffb33533..d54fbca878 100644 --- a/examples/layers/preprocessing/classification/random_channel_shift_demo.py +++ b/examples/layers/preprocessing/classification/random_channel_shift_demo.py @@ -13,9 +13,11 @@ # limitations under the License. -"""random_channel_shift_demo.py shows how to use the RandomChannelShift preprocessing -layer. Operates on the oxford_flowers102 dataset. In this script the flowers -are loaded, then are passed through the preprocessing layers. +"""random_channel_shift_demo.py shows how to use the RandomChannelShift +preprocessing layer. + +Operates on the oxford_flowers102 dataset. In this script the flowers are +loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/random_color_degeneration_demo.py b/examples/layers/preprocessing/classification/random_color_degeneration_demo.py index aefbaa37ff..f90d1a3438 100644 --- a/examples/layers/preprocessing/classification/random_color_degeneration_demo.py +++ b/examples/layers/preprocessing/classification/random_color_degeneration_demo.py @@ -13,7 +13,7 @@ # limitations under the License. """random_color_degeneration_demo.py shows how to use RandomColorDegeneration. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/random_color_jitter_demo.py b/examples/layers/preprocessing/classification/random_color_jitter_demo.py index 1d92b81389..b04d5bca1e 100644 --- a/examples/layers/preprocessing/classification/random_color_jitter_demo.py +++ b/examples/layers/preprocessing/classification/random_color_jitter_demo.py @@ -14,7 +14,7 @@ """color_jitter_demo.py shows how to use the ColorJitter preprocessing layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/random_contrast_demo.py b/examples/layers/preprocessing/classification/random_contrast_demo.py index 61ba24a895..bdb1bb8f59 100644 --- a/examples/layers/preprocessing/classification/random_contrast_demo.py +++ b/examples/layers/preprocessing/classification/random_contrast_demo.py @@ -11,9 +11,10 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. -"""random_contrast_demo.py shows how to use the RandomContrast preprocessing layer. +"""random_contrast_demo.py shows how to use the RandomContrast preprocessing +layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. 
In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/random_cutout_demo.py b/examples/layers/preprocessing/classification/random_cutout_demo.py index 9227903a9b..1391c20e79 100644 --- a/examples/layers/preprocessing/classification/random_cutout_demo.py +++ b/examples/layers/preprocessing/classification/random_cutout_demo.py @@ -13,7 +13,7 @@ # limitations under the License. """random_cutout_demo.py shows how to use the RandomCutout preprocessing layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/random_gaussian_blur_demo.py b/examples/layers/preprocessing/classification/random_gaussian_blur_demo.py index 8d8478e30f..0360296f48 100644 --- a/examples/layers/preprocessing/classification/random_gaussian_blur_demo.py +++ b/examples/layers/preprocessing/classification/random_gaussian_blur_demo.py @@ -11,8 +11,10 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. -"""random_gaussian_blur_demo.py shows how to use the RandomHue preprocessing layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +"""random_gaussian_blur_demo.py shows how to use the RandomHue preprocessing +layer. + +Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/random_hue_demo.py b/examples/layers/preprocessing/classification/random_hue_demo.py index 6cbb99a262..b09b18703b 100644 --- a/examples/layers/preprocessing/classification/random_hue_demo.py +++ b/examples/layers/preprocessing/classification/random_hue_demo.py @@ -12,7 +12,7 @@ # See the License for the specific language governing permissions and # limitations under the License. """random_hue_demo.py shows how to use the RandomHue preprocessing layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/random_saturation_demo.py b/examples/layers/preprocessing/classification/random_saturation_demo.py index 6827e5b145..75f72e8a20 100644 --- a/examples/layers/preprocessing/classification/random_saturation_demo.py +++ b/examples/layers/preprocessing/classification/random_saturation_demo.py @@ -11,9 +11,10 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. -"""random_saturation_demo.py shows how to use the RandomSaturation preprocessing layer. +"""random_saturation_demo.py shows how to use the RandomSaturation preprocessing +layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. 
Finally, they are shown using matplotlib. """ diff --git a/examples/layers/preprocessing/classification/random_shear_demo.py b/examples/layers/preprocessing/classification/random_shear_demo.py index 23ecc02b3d..828c50e5f3 100644 --- a/examples/layers/preprocessing/classification/random_shear_demo.py +++ b/examples/layers/preprocessing/classification/random_shear_demo.py @@ -13,7 +13,7 @@ # limitations under the License. """random_shear_demo.py shows how to use the RandomShear preprocessing layer. -Operates on the oxford_flowers102 dataset. In this script the flowers +Operates on the oxford_flowers102 dataset. In this script the flowers are loaded, then are passed through the preprocessing layers. Finally, they are shown using matplotlib. """ diff --git a/examples/models/generative/stable_diffusion/text_to_image.py b/examples/models/generative/stable_diffusion/text_to_image.py index e95fde98ec..740a8ad205 100644 --- a/examples/models/generative/stable_diffusion/text_to_image.py +++ b/examples/models/generative/stable_diffusion/text_to_image.py @@ -3,7 +3,8 @@ Author: fchollet Date created: 2022/09/24 Last modified: 2022/09/24 -Description: Use StableDiffusion to generate an image according to a short text description. +Description: Use StableDiffusion to generate an image according to a short text + description. """ from PIL import Image diff --git a/examples/training/classification/imagenet/basic_training.py b/examples/training/classification/imagenet/basic_training.py index b498bc50d6..276c928753 100644 --- a/examples/training/classification/imagenet/basic_training.py +++ b/examples/training/classification/imagenet/basic_training.py @@ -16,7 +16,8 @@ Author: [ianjjohnson](https://github.com/ianjjohnson) Date created: 2022/07/25 Last modified: 2022/07/25 -Description: Use KerasCV to train an image classifier using modern best practices +Description: Use KerasCV to train an image classifier using modern best + practices """ import math @@ -36,8 +37,10 @@ """ ## Overview -KerasCV makes training state-of-the-art classification models easy by providing implementations of modern models, preprocessing techniques, and layers. -In this tutorial, we walk through training a model against the Imagenet dataset using Keras and KerasCV. +KerasCV makes training state-of-the-art classification models easy by providing +implementations of modern models, preprocessing techniques, and layers. +In this tutorial, we walk through training a model against the Imagenet dataset +using Keras and KerasCV. This tutorial requires you to have KerasCV installed: ```shell pip install keras-cv @@ -71,30 +74,33 @@ flags.DEFINE_integer( "batch_size", 128, - "Batch size for training and evaluation. This will be multiplied by the number of accelerators in use.", + "Batch size for training and evaluation. This will be multiplied by the " + "number of accelerators in use.", ) flags.DEFINE_boolean( - "use_xla", True, "Whether or not to use XLA (jit_compile) for training." + "use_xla", True, "whether to use XLA (jit_compile) for training." ) flags.DEFINE_boolean( "use_mixed_precision", False, - "Whether or not to use FP16 mixed precision for training.", + "whether to use FP16 mixed precision for training.", ) flags.DEFINE_boolean( "use_ema", True, - "Whether or not to use exponential moving average weight updating", + "whether to use exponential moving average weight updating", ) flags.DEFINE_float( "initial_learning_rate", 0.05, - "Initial learning rate which will reduce on plateau. 
This will be multiplied by the number of accelerators in use", + "Initial learning rate which will reduce on plateau. This will be " + "multiplied by the number of accelerators in use", ) flags.DEFINE_string( "model_kwargs", "{}", - "Keyword argument dictionary to pass to the constructor of the model being trained", + "Keyword argument dictionary to pass to the constructor of the model being " + "trained", ) flags.DEFINE_string( @@ -106,13 +112,16 @@ flags.DEFINE_float( "warmup_steps_percentage", 0.1, - "For how many steps expressed in percentage (0..1 float) of total steps should the schedule warm up if we're using the warmup schedule", + "For how many steps expressed in percentage (0..1 float) of total steps " + "should the schedule warm up if we're using the warmup schedule", ) flags.DEFINE_float( "warmup_hold_steps_percentage", 0.45, - "For how many steps expressed in percentage (0..1 float) of total steps should the schedule hold the initial learning rate after warmup is finished, and before applying cosine decay.", + "For how many steps expressed in percentage (0..1 float) of total steps " + "should the schedule hold the initial learning rate after warmup is " + "finished, and before applying cosine decay.", ) flags.DEFINE_float( @@ -144,7 +153,8 @@ batch size based on the number of accelerators being used. """ -# Try to detect an available TPU. If none is present, default to MirroredStrategy +# Try to detect an available TPU. If none is present, defaults to +# MirroredStrategy try: tpu = tf.distribute.cluster_resolver.TPUClusterResolver.connect() strategy = tf.distribute.TPUStrategy(tpu) @@ -159,7 +169,8 @@ INITIAL_LEARNING_RATE = ( FLAGS.initial_learning_rate * strategy.num_replicas_in_sync ) -"""TFRecord-based tf.data.Dataset loads lazily so we can't get the length of the dataset. Temporary.""" +"""TFRecord-based tf.data.Dataset loads lazily so we can't get the length of +the dataset. Temporary.""" NUM_IMAGES = 1281167 """ @@ -246,9 +257,11 @@ def augment(img, label): """ Optional LR schedule with cosine decay instead of ReduceLROnPlateau -TODO: Replace with Core Keras LRWarmup when it's released. This is a temporary solution. +TODO: Replace with Core Keras LRWarmup when it's released. This is a temporary +solution. -Convinience method for calculating LR at given timestep, for the WarmUpCosineDecay class. +Convenience method for calculating LR at given timestep, for the +WarmUpCosineDecay class. """ @@ -289,19 +302,25 @@ def lr_warmup_cosine_decay( """ -LearningRateSchedule implementing the learning rate warmup with cosine decay strategy. -Learning rate warmup should help with initial training instability, +LearningRateSchedule implementing the learning rate warmup with cosine decay +strategy. Learning rate warmup should help with initial training instability, while the decay strategy may be variable, cosine being a popular choice. -The schedule will start from 0.0 (or supplied start_lr) and gradually "warm up" linearly to the target_lr. -From there, it will apply a cosine decay to the learning rate, after an optional holding period. +The schedule will start from 0.0 (or supplied start_lr) and gradually "warm up" +linearly to the target_lr. From there, it will apply a cosine decay to the +learning rate, after an optional holding period. 
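For readers scanning this hunk, the reflowed docstring describes a piecewise schedule: linear warmup from `start_lr` to `target_lr`, an optional hold at `target_lr`, then cosine decay to zero. The following is a simplified, standalone restatement of that schedule (a hypothetical helper, not the repo's own implementation), useful for sanity-checking the argument semantics listed next.

```python
# Simplified restatement of the warmup -> hold -> cosine-decay behavior the
# docstring above describes. Names mirror the docstring; the real
# WarmUpCosineDecay class in the repo may differ in detail.
import math


def warmup_hold_cosine_lr(step, total_steps, warmup_steps,
                          target_lr=1e-2, start_lr=0.0, hold=0):
    if step < warmup_steps:
        # Linear warmup from start_lr up to target_lr.
        return start_lr + (target_lr - start_lr) * step / max(warmup_steps, 1)
    if step < warmup_steps + hold:
        # Optionally hold target_lr before decay begins.
        return target_lr
    # Cosine decay from target_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps - hold) / max(
        total_steps - warmup_steps - hold, 1
    )
    return 0.5 * target_lr * (1.0 + math.cos(math.pi * progress))


# The peak learning rate is reached exactly at the end of warmup:
print(warmup_hold_cosine_lr(step=100, total_steps=1000, warmup_steps=100, hold=50))
```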
args: - - [float] start_lr: default 0.0, the starting learning rate at the beginning of training from which the warmup starts - - [float] target_lr: default 1e-2, the target (initial) learning rate from which you'd usually start without a LR warmup schedule - - [int] warmup_steps: number of training steps to warm up for expressed in batches - - [int] total_steps: the total steps (epochs * number of batches per epoch) in the dataset - - [int] hold: optional argument to hold the target_lr before applying cosine decay on it + - [float] start_lr: default 0.0, the starting learning rate at the beginning + of training from which the warmup starts + - [float] target_lr: default 1e-2, the target (initial) learning rate from + which you'd usually start without a LR warmup schedule + - [int] warmup_steps: number of training steps to warm up for expressed in + batches + - [int] total_steps: the total steps (epochs * number of batches per epoch) + in the dataset + - [int] hold: optional argument to hold the target_lr before applying cosine + decay on it """ @@ -343,7 +362,8 @@ def __call__(self, step): """ Next, we pick an optimizer. Here we use SGD. -Note that learning rate will decrease over time due to the ReduceLROnPlateau callback or with the LRWarmup scheduler. +Note that learning rate will decrease over time due to the ReduceLROnPlateau +callback or with the LRWarmup scheduler. """ with strategy.scope(): @@ -364,13 +384,15 @@ def __call__(self, step): ) """ -Next, we pick a loss function. We use CategoricalCrossentropy with label smoothing. +Next, we pick a loss function. We use CategoricalCrossentropy with label +smoothing. """ loss_fn = losses.CategoricalCrossentropy(label_smoothing=0.1) """ -Next, we specify the metrics that we want to track. For this example, we track accuracy. +Next, we specify the metrics that we want to track. For this example, we track +accuracy. """ with strategy.scope(): training_metrics = [ diff --git a/examples/training/contrastive/imagenet/simclr_training.py b/examples/training/contrastive/imagenet/simclr_training.py index d92d0416dc..8b0f982d24 100644 --- a/examples/training/contrastive/imagenet/simclr_training.py +++ b/examples/training/contrastive/imagenet/simclr_training.py @@ -50,7 +50,7 @@ "batch_size", 256, "Batch size for training and evaluation." ) flags.DEFINE_boolean( - "use_xla", True, "Whether or not to use XLA (jit_compile) for training." + "use_xla", True, "whether to use XLA (jit_compile) for training." ) flags.DEFINE_float( "initial_learning_rate", diff --git a/examples/training/object_detection/pascal_voc/faster_rcnn.py b/examples/training/object_detection/pascal_voc/faster_rcnn.py index f8133a8304..6db2af0a69 100644 --- a/examples/training/object_detection/pascal_voc/faster_rcnn.py +++ b/examples/training/object_detection/pascal_voc/faster_rcnn.py @@ -49,7 +49,8 @@ # parameters from FasterRCNN [paper](https://arxiv.org/pdf/1506.01497.pdf) -# Try to detect an available TPU. If none is present, default to MirroredStrategy +# Try to detect an available TPU. 
If none is present, defaults to +# MirroredStrategy try: tpu = tf.distribute.cluster_resolver.TPUClusterResolver.connect() strategy = tf.distribute.TPUStrategy(tpu) diff --git a/examples/training/object_detection/pascal_voc/retina_net.py b/examples/training/object_detection/pascal_voc/retina_net.py index f546356472..11876d9c04 100644 --- a/examples/training/object_detection/pascal_voc/retina_net.py +++ b/examples/training/object_detection/pascal_voc/retina_net.py @@ -17,7 +17,7 @@ Date created: 2022/09/27 Last modified: 2023/03/29 Description: Use KerasCV to train a RetinaNet on Pascal VOC 2007. -""" +""" # noqa: E501 import resource import sys @@ -57,7 +57,8 @@ # parameters from RetinaNet [paper](https://arxiv.org/abs/1708.02002) -# Try to detect an available TPU. If none is present, default to MirroredStrategy +# Try to detect an available TPU. If none is present, defaults to +# MirroredStrategy try: tpu = tf.distribute.cluster_resolver.TPUClusterResolver.connect() strategy = tf.distribute.TPUStrategy(tpu) @@ -156,10 +157,11 @@ def pad_fn(inputs): """ ## Model creation -We'll use the KerasCV API to construct a RetinaNet model. In this tutorial we use -a pretrained ResNet50 backbone using weights. In order to perform fine-tuning, we -freeze the backbone before training. When `include_rescaling=True` is set, inputs to -the model are expected to be in the range `[0, 255]`. +We'll use the KerasCV API to construct a RetinaNet model. In this tutorial we +use a pretrained ResNet50 backbone using weights. In order to perform +fine-tuning, we freeze the backbone before training. When +`include_rescaling=True` is set, inputs to the model are expected to be in the +range `[0, 255]`. """ with strategy.scope(): @@ -223,8 +225,8 @@ def on_epoch_end(self, epoch, logs): keras.callbacks.ModelCheckpoint(FLAGS.weights_name, save_weights_only=True), # Temporarily need PyCOCOCallback to verify # a 1:1 comparison with the PyMetrics version. - # Currently, results do not match. I have a feeling this is due - # to how we are creating the boxes in `BoxCOCOMetrics` + # Currently, results do not match. I have a feeling this is due + # to how we are creating the boxes in `BoxCOCOMetrics` PyCOCOCallback(eval_ds, bounding_box_format="xywh"), EvaluateCOCOMetricsCallback(eval_ds), keras.callbacks.TensorBoard(log_dir=FLAGS.tensorboard_path), diff --git a/examples/training/object_detection_3d/waymo/serialize_records.py b/examples/training/object_detection_3d/waymo/serialize_records.py index 08f1e69f16..6ef475aa0b 100644 --- a/examples/training/object_detection_3d/waymo/serialize_records.py +++ b/examples/training/object_detection_3d/waymo/serialize_records.py @@ -19,8 +19,10 @@ from keras_cv.datasets.waymo import build_tensors_for_augmentation from keras_cv.datasets.waymo import load -TRAINING_RECORD_PATH = "./wod_records" # "gs://waymo_open_dataset_v_1_0_0_individual_files/training" -TRANSFORMED_RECORD_PATH = "./wod_transformed" # "gs://waymo_open_dataset_v_1_0_0_individual_files/training" +# "gs://waymo_open_dataset_v_1_0_0_individual_files/training" +TRAINING_RECORD_PATH = "./wod_records" +# "gs://waymo_open_dataset_v_1_0_0_individual_files/training" +TRANSFORMED_RECORD_PATH = "./wod_transformed" def _float_feature(value): @@ -32,8 +34,8 @@ def serialize_example(feature0, feature1): """ Creates a tf.train.Example message ready to be written to a file. """ - # Create a dictionary mapping the feature name to the tf.train.Example-compatible - # data type. 
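As context for the `serialize_example` hunk being rewrapped here: the script flattens each tensor and stores it as a float feature of a `tf.train.Example`. A self-contained sketch of that pattern is below; the output filename is hypothetical.

```python
# Sketch of the TFRecord serialization pattern used in serialize_records.py:
# flatten each tensor and store it as a float feature. The path is made up.
import tensorflow as tf


def _float_feature(value):
    # Wrap a flat array of floats as a tf.train.Feature.
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))


def serialize_example(point_clouds, bounding_boxes):
    feature = {
        "point_clouds": _float_feature(tf.reshape(point_clouds, [-1]).numpy()),
        "bounding_boxes": _float_feature(tf.reshape(bounding_boxes, [-1]).numpy()),
    }
    return tf.train.Example(
        features=tf.train.Features(feature=feature)
    ).SerializeToString()


with tf.io.TFRecordWriter("example.tfrecord") as writer:
    writer.write(serialize_example(tf.zeros([4, 3]), tf.zeros([2, 7])))
```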
+ # Create a dictionary mapping the feature name to the + # tf.train.Example-compatible data type. feature = { "point_clouds": _float_feature(tf.reshape(feature0, [-1]).numpy()), "bounding_boxes": _float_feature(tf.reshape(feature1, [-1]).numpy()), diff --git a/examples/training/object_detection_3d/waymo/train_pillars.py b/examples/training/object_detection_3d/waymo/train_pillars.py index 1bd3b14f4e..44145d8940 100644 --- a/examples/training/object_detection_3d/waymo/train_pillars.py +++ b/examples/training/object_detection_3d/waymo/train_pillars.py @@ -19,7 +19,8 @@ from keras_cv.layers import preprocessing3d # use serialize_records to convert WOD frame to Tensors -TRAINING_RECORD_PATH = "./wod_transformed" # "gs://waymo_open_dataset_v_1_0_0_individual_files/training" +# "gs://waymo_open_dataset_v_1_0_0_individual_files/training" +TRAINING_RECORD_PATH = "./wod_transformed" global_batch = 1 @@ -104,7 +105,7 @@ def augment(inputs): # in KerasCV # ### Load the evaluation dataset -# EVALUATION_RECORD_PATH = "./wod-records"#"gs://waymo_open_dataset_v_1_0_0_individual_files/validation" +# EVALUATION_RECORD_PATH = "./wod-records"#"gs://waymo_open_dataset_v_1_0_0_individual_files/validation" # noqa: E501 # eval_ds = load(EVALUATION_RECORD_PATH, simple_transformer, output_signature) # # @@ -113,14 +114,15 @@ def augment(inputs): # with strategy.scope(): # model = None # TODO Need to import model and instantiate it here # -# model.compile(optimizer="adam", loss=None) # TODO need to specify appropriate loss here +# model.compile(optimizer="adam", loss=None) +# TODO need to specify appropriate loss here # # # ### Fit the model with a callback to log scores on our evaluation dataset # model.fit( # train_ds, # callbacks=[ -# # TODO Uncomment when ready from keras_cv.callbacks import WaymoDetectionMetrics +# TODO Uncomment when ready from keras_cv.callbacks import WaymoDetectionMetrics # WaymoDetectionMetrics(eval_ds), # keras.callbacks.TensorBoard(TENSORBOARD_LOGS_PATH), # ], diff --git a/examples/training/semantic_segmentation/pascal_voc/basic_training.py b/examples/training/semantic_segmentation/pascal_voc/basic_training.py index 7392f76845..43219089e4 100644 --- a/examples/training/semantic_segmentation/pascal_voc/basic_training.py +++ b/examples/training/semantic_segmentation/pascal_voc/basic_training.py @@ -12,7 +12,7 @@ # See the License for the specific language governing permissions and # limitations under the License. """ -Title: Train an Semantic Segmentation Model on Pascal VOC 2012 using KerasCV +Title: Train a Semantic Segmentation Model on Pascal VOC 2012 using KerasCV Author: [tanzhenyu](https://github.com/tanzhenyu) Date created: 2022/10/25 Last modified: 2022/10/25 @@ -37,7 +37,7 @@ flags.DEFINE_boolean( "mixed_precision", True, - "Whether or not to use FP16 mixed precision for training.", + "whether to use FP16 mixed precision for training.", ) flags.DEFINE_string( "tensorboard_path", @@ -59,7 +59,8 @@ flags.DEFINE_string( "model_kwargs", "{}", - "Keyword argument dictionary to pass to the constructor of the model being trained", + "Keyword argument dictionary to pass to the constructor of the model being" + " trained", ) FLAGS = flags.FLAGS @@ -69,7 +70,8 @@ logging.info("mixed precision training enabled") keras.mixed_precision.set_global_policy("mixed_float16") -# Try to detect an available TPU. If none is present, default to MirroredStrategy +# Try to detect an available TPU. 
If none is present, defaults to +# MirroredStrategy try: tpu = tf.distribute.cluster_resolver.TPUClusterResolver.connect() strategy = tf.distribute.TPUStrategy(tpu) diff --git a/examples/visualization/plot_bounding_box_gallery.py b/examples/visualization/plot_bounding_box_gallery.py index 6fcae2787e..c8a2b92f44 100644 --- a/examples/visualization/plot_bounding_box_gallery.py +++ b/examples/visualization/plot_bounding_box_gallery.py @@ -7,8 +7,9 @@ """ """ -`keras_cv.visualization.plot_bounding_box_gallery()` is a function dedicated to the -visualization of bounding boxes predicted by a `keras_cv` object detection model. +`keras_cv.visualization.plot_bounding_box_gallery()` is a function dedicated to +the visualization of bounding boxes predicted by a `keras_cv` object detection +model. """ import tensorflow as tf diff --git a/examples/visualization/plot_image_gallery.py b/examples/visualization/plot_image_gallery.py index 41a92c35fc..297cc6d0a8 100644 --- a/examples/visualization/plot_image_gallery.py +++ b/examples/visualization/plot_image_gallery.py @@ -3,11 +3,12 @@ Author: [lukewood](https://lukewood.xyz) Date created: 2022/10/16 Last modified: 2022/10/16 -Description: Visualize ground truth and predicted bounding boxes for a given dataset. +Description: Visualize ground truth and predicted bounding boxes for a given + dataset. """ """ -Plotting images from a TensorFlow dataset is easy with KerasCV. Behold: +Plotting images from a TensorFlow dataset is easy with KerasCV. Behold: """ import tensorflow as tf diff --git a/keras_cv/bounding_box/converters.py b/keras_cv/bounding_box/converters.py index 9a93fb2675..46c59f7b89 100644 --- a/keras_cv/bounding_box/converters.py +++ b/keras_cv/bounding_box/converters.py @@ -21,8 +21,8 @@ from tensorflow import keras -# Internal exception to propagate the fact images was not passed to a converter that -# needs it +# Internal exception to propagate the fact images was not passed to a converter +# that needs it. class RequiresImagesException(Exception): pass @@ -300,35 +300,36 @@ def convert_format( f"""Converts bounding_boxes from one format to another. Supported formats are: - - `"xyxy"`, also known as `corners` format. In this format the first four axes - represent `[left, top, right, bottom]` in that order. - - `"rel_xyxy"`. In this format, the axes are the same as `"xyxy"` but the x - coordinates are normalized using the image width, and the y axes the image - height. All values in `rel_xyxy` are in the range `(0, 1)`. - - `"xywh"`. In this format the first four axes represent + - `"xyxy"`, also known as `corners` format. In this format the first four + axes represent `[left, top, right, bottom]` in that order. + - `"rel_xyxy"`. In this format, the axes are the same as `"xyxy"` but the x + coordinates are normalized using the image width, and the y axes the + image height. All values in `rel_xyxy` are in the range `(0, 1)`. + - `"xywh"`. In this format the first four axes represent `[left, top, width, height]`. - - `"rel_xywh". In this format the first four axes represent - [left, top, width, height], just like `"xywh"`. Unlike `"xywh"`, the values - are in the range (0, 1) instead of absolute pixel values. - - `"center_xyWH"`. In this format the first two coordinates represent the x and y - coordinates of the center of the bounding box, while the last two represent - the width and height of the bounding box. - - `"center_yxHW"`. 
In this format the first two coordinates represent the y and x - coordinates of the center of the bounding box, while the last two represent - the height and width of the bounding box. - - `"yxyx"`. In this format the first four axes represent [top, left, bottom, right] - in that order. - - `"rel_yxyx"`. In this format, the axes are the same as `"yxyx"` but the x - coordinates are normalized using the image width, and the y axes the image - height. All values in `rel_yxyx` are in the range (0, 1). - Formats are case insensitive. It is recommended that you capitalize width and - height to maximize the visual difference between `"xyWH"` and `"xyxy"`. - - Relative formats, abbreviated `rel`, make use of the shapes of the `images` passed. - In these formats, the coordinates, widths, and heights are all specified as - percentages of the host image. `images` may be a ragged Tensor. Note that using a - ragged Tensor for images may cause a substantial performance loss, as each image - will need to be processed separately due to the mismatching image shapes. + - `"rel_xywh". In this format the first four axes represent + [left, top, width, height], just like `"xywh"`. Unlike `"xywh"`, the + values are in the range (0, 1) instead of absolute pixel values. + - `"center_xyWH"`. In this format the first two coordinates represent the x + and y coordinates of the center of the bounding box, while the last two + represent the width and height of the bounding box. + - `"center_yxHW"`. In this format the first two coordinates represent the y + and x coordinates of the center of the bounding box, while the last two + represent the height and width of the bounding box. + - `"yxyx"`. In this format the first four axes represent + [top, left, bottom, right] in that order. + - `"rel_yxyx"`. In this format, the axes are the same as `"yxyx"` but the x + coordinates are normalized using the image width, and the y axes the + image height. All values in `rel_yxyx` are in the range (0, 1). + Formats are case insensitive. It is recommended that you capitalize width + and height to maximize the visual difference between `"xyWH"` and `"xyxy"`. + + Relative formats, abbreviated `rel`, make use of the shapes of the `images` + passed. In these formats, the coordinates, widths, and heights are all + specified as percentages of the host image. `images` may be a ragged + Tensor. Note that using a ragged Tensor for images may cause a substantial + performance loss, as each image will need to be processed separately due to + the mismatching image shapes. Usage: @@ -342,22 +343,23 @@ def convert_format( ``` Args: - boxes: tf.Tensor representing bounding boxes in the format specified in the - `source` parameter. `boxes` can optionally have extra dimensions stacked on - the final axis to store metadata. boxes should be a 3D Tensor, with the - shape `[batch_size, num_boxes, 4]`. Alternatively, boxes can be a - dictionary with key 'boxes' containing a Tensor matching the aforementioned - spec. - source: One of {" ".join([f'"{f}"' for f in TO_XYXY_CONVERTERS.keys()])}. Used - to specify the original format of the `boxes` parameter. - target: One of {" ".join([f'"{f}"' for f in TO_XYXY_CONVERTERS.keys()])}. Used - to specify the destination format of the `boxes` parameter. - images: (Optional) a batch of images aligned with `boxes` on the first axis. - Should be at least 3 dimensions, with the first 3 dimensions representing: - `[batch_size, height, width]`. 
Used in some converters to compute relative - pixel values of the bounding box dimensions. Required when transforming - from a rel format to a non-rel format. - dtype: the data type to use when transforming the boxes. Defaults to + boxes: tf.Tensor representing bounding boxes in the format specified in + the `source` parameter. `boxes` can optionally have extra + dimensions stacked on the final axis to store metadata. boxes + should be a 3D Tensor, with the shape `[batch_size, num_boxes, 4]`. + Alternatively, boxes can be a dictionary with key 'boxes' containing + a Tensor matching the aforementioned spec. + source:One of {" ".join([f'"{f}"' for f in TO_XYXY_CONVERTERS.keys()])}. + Used to specify the original format of the `boxes` parameter. + target:One of {" ".join([f'"{f}"' for f in TO_XYXY_CONVERTERS.keys()])}. + Used to specify the destination format of the `boxes` parameter. + images: (Optional) a batch of images aligned with `boxes` on the first + axis. Should be at least 3 dimensions, with the first 3 dimensions + representing: `[batch_size, height, width]`. Used in some + converters to compute relative pixel values of the bounding box + dimensions. Required when transforming from a rel format to a + non-rel format. + dtype: the data type to use when transforming the boxes, defaults to `tf.float32`. """ if isinstance(boxes, dict): @@ -373,13 +375,13 @@ def convert_format( if boxes.shape[-1] != 4: raise ValueError( - "Expected `boxes` to be a Tensor with a " - f"final dimension of `4`. Instead, got `boxes.shape={boxes.shape}`." + "Expected `boxes` to be a Tensor with a final dimension of " + f"`4`. Instead, got `boxes.shape={boxes.shape}`." ) if images is not None and image_shape is not None: raise ValueError( - "convert_format() expects either `images` or `image_shape`, " - f"but not both. Received images={images} image_shape={image_shape}" + "convert_format() expects either `images` or `image_shape`, but " + f"not both. Received images={images} image_shape={image_shape}" ) _validate_image_shape(image_shape) @@ -388,15 +390,15 @@ def convert_format( target = target.lower() if source not in TO_XYXY_CONVERTERS: raise ValueError( - f"`convert_format()` received an unsupported format for the argument " - f"`source`. `source` should be one of {TO_XYXY_CONVERTERS.keys()}. " - f"Got source={source}" + "`convert_format()` received an unsupported format for the " + "argument `source`. `source` should be one of " + f"{TO_XYXY_CONVERTERS.keys()}. Got source={source}" ) if target not in FROM_XYXY_CONVERTERS: raise ValueError( - f"`convert_format()` received an unsupported format for the argument " - f"`target`. `target` should be one of {FROM_XYXY_CONVERTERS.keys()}. " - f"Got target={target}" + "`convert_format()` received an unsupported format for the " + "argument `target`. `target` should be one of " + f"{FROM_XYXY_CONVERTERS.keys()}. Got target={target}" ) boxes = tf.cast(boxes, dtype) @@ -417,8 +419,8 @@ def convert_format( result = from_xyxy_fn(in_xyxy, images=images, image_shape=image_shape) except RequiresImagesException: raise ValueError( - "convert_format() must receive `images` or `image_shape` when transforming " - f"between relative and absolute formats." + "convert_format() must receive `images` or `image_shape` when " + "transforming between relative and absolute formats." f"convert_format() received source=`{format}`, target=`{format}, " f"but images={images} and image_shape={image_shape}." 
) @@ -445,10 +447,12 @@ def _format_inputs(boxes, images): images_include_batch = images_rank == 4 if boxes_includes_batch != images_include_batch: raise ValueError( - "convert_format() expects both boxes and images to be batched, or both " - f"boxes and images to be unbatched. Received len(boxes.shape)={boxes_rank}, " - f"len(images.shape)={images_rank}. Expected either len(boxes.shape)=2 AND " - "len(images.shape)=3, or len(boxes.shape)=3 AND len(images.shape)=4." + "convert_format() expects both boxes and images to be batched, " + "or both boxes and images to be unbatched. Received " + f"len(boxes.shape)={boxes_rank}, " + f"len(images.shape)={images_rank}. Expected either " + "len(boxes.shape)=2 AND len(images.shape)=3, or " + "len(boxes.shape)=3 AND len(images.shape)=4." ) if not images_include_batch: images = tf.expand_dims(images, axis=0) @@ -487,7 +491,7 @@ def _validate_image_shape(image_shape): # Warn about failure cases raise ValueError( - "Expected image_shape to be either a tuple, list, Tensor. " + "Expected image_shape to be either a tuple, list, Tensor. " f"Received image_shape={image_shape}" ) diff --git a/keras_cv/bounding_box/formats.py b/keras_cv/bounding_box/formats.py index 1dda0e81bb..04a23cd364 100644 --- a/keras_cv/bounding_box/formats.py +++ b/keras_cv/bounding_box/formats.py @@ -23,7 +23,7 @@ class XYXY: The XYXY format consists of the following required indices: - - LEFT: left hand side of the bounding box + - LEFT: left of the bounding box - TOP: top of the bounding box - RIGHT: right of the bounding box - BOTTOM: bottom of the bounding box @@ -38,13 +38,13 @@ class XYXY: class REL_XYXY: """REL_XYXY contains axis indices for the REL_XYXY format. - REL_XYXY is like XYXY, but each value is relative to the width and height of the - origin image. Values are percentages of the origin images' width and height - respectively. + REL_XYXY is like XYXY, but each value is relative to the width and height of + the origin image. Values are percentages of the origin images' width and + height respectively. The REL_XYXY format consists of the following required indices: - - LEFT: left hand side of the bounding box + - LEFT: left of the bounding box - TOP: top of the bounding box - RIGHT: right of the bounding box - BOTTOM: bottom of the bounding box @@ -97,9 +97,9 @@ class XYWH: class REL_XYWH: """REL_XYWH contains axis indices for the XYWH format. - REL_XYXY is like XYWH, but each value is relative to the width and height of the - origin image. Values are percentages of the origin images' width and height - respectively. + REL_XYXY is like XYWH, but each value is relative to the width and height of + the origin image. Values are percentages of the origin images' width and + height respectively. - X: X coordinate of the left of the bounding box - Y: Y coordinate of the top of the bounding box @@ -121,7 +121,7 @@ class YXYX: The YXYX format consists of the following required indices: - TOP: top of the bounding box - - LEFT: left hand side of the bounding box + - LEFT: left of the bounding box - BOTTOM: bottom of the bounding box - RIGHT: right of the bounding box """ @@ -135,14 +135,14 @@ class YXYX: class REL_YXYX: """REL_YXYX contains axis indices for the REL_YXYX format. - REL_YXYX is like YXYX, but each value is relative to the width and height of the - origin image. Values are percentages of the origin images' width and height - respectively. + REL_YXYX is like YXYX, but each value is relative to the width and height of + the origin image. 
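Several of the format docstrings in this hunk (the `REL_*` classes, and `convert_format` above) describe relative coordinates as fractions of the image size. A brief illustrative snippet of moving from an absolute to a relative format follows; the box values are made up, and the call mirrors the `convert_format` docstring earlier in this diff.

```python
# Converting absolute corner boxes to a relative format. Relative targets
# need `images` (or `image_shape`) so the converter knows what to normalize by.
import tensorflow as tf

import keras_cv

images = tf.zeros([1, 100, 200, 3])  # [batch, height, width, channels]
boxes = {
    "boxes": tf.constant([[[10.0, 20.0, 110.0, 70.0]]]),  # absolute xyxy
    "classes": tf.constant([[0.0]]),
}

rel = keras_cv.bounding_box.convert_format(
    boxes, source="xyxy", target="rel_xyxy", images=images
)
print(rel["boxes"])  # x values divided by width=200, y values by height=100
```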
Values are percentages of the origin images' width and + height respectively. The REL_YXYX format consists of the following required indices: - TOP: top of the bounding box - - LEFT: left hand side of the bounding box + - LEFT: left of the bounding box - BOTTOM: bottom of the bounding box - RIGHT: right of the bounding box """ diff --git a/keras_cv/bounding_box/iou.py b/keras_cv/bounding_box/iou.py index 34bc292132..d0c7634358 100644 --- a/keras_cv/bounding_box/iou.py +++ b/keras_cv/bounding_box/iou.py @@ -73,48 +73,55 @@ def compute_iou( ): """Computes a lookup table vector containing the ious for a given set boxes. - The lookup vector is to be indexed by [`boxes1_index`,`boxes2_index`] if boxes - are unbatched and by [`batch`, `boxes1_index`,`boxes2_index`] if the boxes are - batched. + The lookup vector is to be indexed by [`boxes1_index`,`boxes2_index`] if + boxes are unbatched and by [`batch`, `boxes1_index`,`boxes2_index`] if the + boxes are batched. The users can pass `boxes1` and `boxes2` to be different ranks. For example: - 1) `boxes1`: [batch_size, M, 4], `boxes2`: [batch_size, N, 4] -> return [batch_size, M, N]. - 2) `boxes1`: [batch_size, M, 4], `boxes2`: [N, 4] -> return [batch_size, M, N] - 3) `boxes1`: [M, 4], `boxes2`: [batch_size, N, 4] -> return [batch_size, M, N] + 1) `boxes1`: [batch_size, M, 4], `boxes2`: [batch_size, N, 4] -> return + [batch_size, M, N]. + 2) `boxes1`: [batch_size, M, 4], `boxes2`: [N, 4] -> return + [batch_size, M, N] + 3) `boxes1`: [M, 4], `boxes2`: [batch_size, N, 4] -> return + [batch_size, M, N] 4) `boxes1`: [M, 4], `boxes2`: [N, 4] -> return [M, N] Args: - boxes1: a list of bounding boxes in 'corners' format. Can be batched or unbatched. - boxes2: a list of bounding boxes in 'corners' format. Can be batched or unbatched. + boxes1: a list of bounding boxes in 'corners' format. Can be batched or + unbatched. + boxes2: a list of bounding boxes in 'corners' format. Can be batched or + unbatched. bounding_box_format: a case-insensitive string which is one of `"xyxy"`, `"rel_xyxy"`, `"xyWH"`, `"center_xyWH"`, `"yxyx"`, `"rel_yxyx"`. For detailed information on the supported format, see the [KerasCV bounding box documentation](https://keras.io/api/keras_cv/bounding_box/formats/). - use_masking: whether masking will be applied. This will mask all `boxes1` or `boxes2` that - have values less then 0 in all its 4 dimensions. Default to `False`. - mask_val: int to mask those returned IOUs if the masking is True. Default to -1. + use_masking: whether masking will be applied. This will mask all `boxes1` + or `boxes2` that have values less than 0 in all its 4 dimensions. + Default to `False`. + mask_val: int to mask those returned IOUs if the masking is True, defaults + to -1. Returns: iou_lookup_table: a vector containing the pairwise ious of boxes1 and boxes2. - """ + """ # noqa: E501 boxes1_rank = len(boxes1.shape) boxes2_rank = len(boxes2.shape) if boxes1_rank not in [2, 3]: raise ValueError( - "compute_iou() expects boxes1 to be batched, or " - f"to be unbatched. Received len(boxes1.shape)={boxes1_rank}, " - f"len(boxes2.shape)={boxes2_rank}. Expected either len(boxes1.shape)=2 AND " - "or len(boxes1.shape)=3." + "compute_iou() expects boxes1 to be batched, or to be unbatched. " + f"Received len(boxes1.shape)={boxes1_rank}, " + f"len(boxes2.shape)={boxes2_rank}. Expected either " + "len(boxes1.shape)=2 AND or len(boxes1.shape)=3." ) if boxes2_rank not in [2, 3]: raise ValueError( - "compute_iou() expects boxes2 to be batched, or " - f"to be unbatched. 
Received len(boxes1.shape)={boxes1_rank}, " - f"len(boxes2.shape)={boxes2_rank}. Expected either len(boxes2.shape)=2 AND " - "or len(boxes2.shape)=3." + "compute_iou() expects boxes2 to be batched, or to be unbatched. " + f"Received len(boxes1.shape)={boxes1_rank}, " + f"len(boxes2.shape)={boxes2_rank}. Expected either " + "len(boxes2.shape)=2 AND or len(boxes2.shape)=3." ) target_format = "yxyx" diff --git a/keras_cv/bounding_box/mask_invalid_detections.py b/keras_cv/bounding_box/mask_invalid_detections.py index 52f3f4b49f..91df9c19a4 100644 --- a/keras_cv/bounding_box/mask_invalid_detections.py +++ b/keras_cv/bounding_box/mask_invalid_detections.py @@ -20,34 +20,38 @@ def mask_invalid_detections(bounding_boxes, output_ragged=False): """masks out invalid detections with -1s. - This utility is mainly used on the output of `tf.image.combined_non_max_suppression` - operations. The output of `tf.image.combined_non_max_suppression` contains all of - the detections, even invalid ones. Users are expected to use `num_detections` to - determine how many boxes are in each image. + This utility is mainly used on the output of + `tf.image.combined_non_max_suppression` operations. The output of + `tf.image.combined_non_max_suppression` contains all the detections, even + invalid ones. Users are expected to use `num_detections` to determine how + many boxes are in each image. In contrast, KerasCV expects all bounding boxes to be padded with -1s. This function uses the value of `num_detections` to mask out invalid boxes with -1s. Args: - bounding_boxes: a dictionary complying with KerasCV bounding box format. In - addition to the normal required keys, these boxes are also expected to have - a `num_detections` key. - output_ragged: whether or not to output RaggedTensor based bounding boxes. + bounding_boxes: a dictionary complying with KerasCV bounding box format. + In addition to the normal required keys, these boxes are also + expected to have a `num_detections` key. + output_ragged: whether to output RaggedTensor based bounding + boxes. Returns: - bounding boxes with proper masking of the boxes according to `num_detections`. - This allows proper interop with `tf.image.combined_non_max_suppression`. - Returned boxes match the specification fed to the function, so if the bounding - box tensor uses `tf.RaggedTensor` to represent boxes the returned value will - also return `tf.RaggedTensor` representations. + bounding boxes with proper masking of the boxes according to + `num_detections`. This allows proper interop with + `tf.image.combined_non_max_suppression`. Returned boxes match the + specification fed to the function, so if the bounding box tensor uses + `tf.RaggedTensor` to represent boxes the returned value will also return + `tf.RaggedTensor` representations. """ # ensure we are complying with KerasCV bounding box format. info = validate_format(bounding_boxes) if info["ragged"]: raise ValueError( - "`bounding_box.mask_invalid_detections()` requires inputs to be Dense " - "tensors. Please call `bounding_box.to_dense(bounding_boxes)` before " - "passing your boxes to `bounding_box.mask_invalid_detections()`." + "`bounding_box.mask_invalid_detections()` requires inputs to be " + "Dense tensors. Please call " + "`bounding_box.to_dense(bounding_boxes)` before passing your boxes " + "to `bounding_box.mask_invalid_detections()`." 
) if "num_detections" not in bounding_boxes: raise ValueError( @@ -69,7 +73,7 @@ def mask_invalid_detections(bounding_boxes, output_ragged=False): classes = tf.where(mask, classes, -tf.ones_like(classes)) - # resuse mask for boxes + # reuse mask for boxes mask = tf.expand_dims(mask, axis=-1) mask = tf.repeat(mask, repeats=boxes.shape[-1], axis=-1) boxes = tf.where(mask, boxes, -tf.ones_like(boxes)) diff --git a/keras_cv/bounding_box/to_dense.py b/keras_cv/bounding_box/to_dense.py index e243d922b6..76bc8410d2 100644 --- a/keras_cv/bounding_box/to_dense.py +++ b/keras_cv/bounding_box/to_dense.py @@ -41,9 +41,9 @@ def to_dense(bounding_boxes, max_boxes=None, default_value=-1): Args: bounding_boxes: bounding boxes in KerasCV dictionary format. max_boxes: the maximum number of boxes, used to pad tensors to a given - shape. This can be used to make object detection pipelines TPU + shape. This can be used to make object detection pipelines TPU compatible. - default_value: the default value to pad bounding boxes with. defaults + default_value: the default value to pad bounding boxes with. defaults to -1. """ info = validate_format.validate_format(bounding_boxes) diff --git a/keras_cv/bounding_box/to_ragged.py b/keras_cv/bounding_box/to_ragged.py index 4c2688f1ad..fedd9f20ac 100644 --- a/keras_cv/bounding_box/to_ragged.py +++ b/keras_cv/bounding_box/to_ragged.py @@ -19,10 +19,11 @@ def to_ragged(bounding_boxes, sentinel=-1, dtype=tf.float32): """converts a Dense padded bounding box `tf.Tensor` to a `tf.RaggedTensor`. - Bounding boxes are ragged tensors in most use cases. Converting them to a dense - tensor makes it easier to work with Tensorflow ecosystem. + Bounding boxes are ragged tensors in most use cases. Converting them to a + dense tensor makes it easier to work with Tensorflow ecosystem. This function can be used to filter out the masked out bounding boxes by - checking for padded sentinel value of the class_id axis of the bounding_boxes. + checking for padded sentinel value of the class_id axis of the + bounding_boxes. Usage: ```python @@ -39,13 +40,14 @@ def to_ragged(bounding_boxes, sentinel=-1, dtype=tf.float32): ``` Args: - bounding_boxes: a Tensor of bounding boxes. May be batched, or unbatched. - sentinel: The value indicating that a bounding box does not exist at the current - index, and the corresponding box is padding. Defaults to -1. + bounding_boxes: a Tensor of bounding boxes. May be batched, or + unbatched. + sentinel: The value indicating that a bounding box does not exist at the + current index, and the corresponding box is padding, defaults to -1. dtype: the data type to use for the underlying Tensors. Returns: - dictionary of `tf.RaggedTensor` or 'tf.Tensor' containing the filtered bounding - boxes. + dictionary of `tf.RaggedTensor` or 'tf.Tensor' containing the filtered + bounding boxes. """ info = validate_format.validate_format(bounding_boxes) diff --git a/keras_cv/bounding_box/utils.py b/keras_cv/bounding_box/utils.py index 24750af47b..a03915a8ca 100644 --- a/keras_cv/bounding_box/utils.py +++ b/keras_cv/bounding_box/utils.py @@ -27,7 +27,7 @@ def is_relative(bounding_box_format): ): raise ValueError( "`is_relative()` received an unsupported format for the argument " - f"`bounding_box_format`. `bounding_box_format` should be one of " + f"`bounding_box_format`. `bounding_box_format` should be one of " f"{bounding_box.converters.TO_XYXY_CONVERTERS.keys()}. 
" f"Got bounding_box_format={bounding_box_format}" ) @@ -69,15 +69,16 @@ def clip_to_image( ): """clips bounding boxes to image boundaries. - `clip_to_image()` clips bounding boxes that have coordinates out of bounds of an - image down to the boundaries of the image. This is done by converting the bounding - box to relative formats, then clipping them to the `[0, 1]` range. Additionally, - bounding boxes that end up with a zero area have their class ID set to -1, - indicating that there is no object present in them. + `clip_to_image()` clips bounding boxes that have coordinates out of bounds + of an image down to the boundaries of the image. This is done by converting + the bounding box to relative formats, then clipping them to the `[0, 1]` + range. Additionally, bounding boxes that end up with a zero area have their + class ID set to -1, indicating that there is no object present in them. Args: bounding_boxes: bounding box tensor to clip. - bounding_box_format: the KerasCV bounding box format the bounding boxes are in. + bounding_box_format: the KerasCV bounding box format the bounding boxes + are in. images: list of images to clip the bounding boxes to. image_shape: the shape of the images to clip the bounding boxes to. """ @@ -170,10 +171,12 @@ def _format_inputs(boxes, classes, images): images_include_batch = images_rank == 4 if boxes_includes_batch != images_include_batch: raise ValueError( - "clip_to_image() expects both boxes and images to be batched, or both " - f"boxes and images to be unbatched. Received len(boxes.shape)={boxes_rank}, " - f"len(images.shape)={images_rank}. Expected either len(boxes.shape)=2 AND " - "len(images.shape)=3, or len(boxes.shape)=3 AND len(images.shape)=4." + "clip_to_image() expects both boxes and images to be batched, " + "or both boxes and images to be unbatched. Received " + f"len(boxes.shape)={boxes_rank}, " + f"len(images.shape)={images_rank}. Expected either " + "len(boxes.shape)=2 AND len(images.shape)=3, or " + "len(boxes.shape)=3 AND len(images.shape)=4." ) if not images_include_batch: images = tf.expand_dims(images, axis=0) diff --git a/keras_cv/bounding_box/validate_format.py b/keras_cv/bounding_box/validate_format.py index 4bc8faa3d1..c74e87beea 100644 --- a/keras_cv/bounding_box/validate_format.py +++ b/keras_cv/bounding_box/validate_format.py @@ -15,14 +15,16 @@ def validate_format(bounding_boxes, variable_name="bounding_boxes"): - """validates that a given set of bounding boxes complies with KerasCV format. + """validates that a given set of bounding boxes complies with KerasCV + format. - For a set of bounding boxes to be valid it must satisfy the following conditions: + For a set of bounding boxes to be valid it must satisfy the following + conditions: - `bounding_boxes` must be a dictionary - contains keys `"boxes"` and `"classes"` - - each entry must have matching first two dimensions; representing the batch axis - and the number of boxes per image axis. - - either both `"boxes"` and `"classes"` are batched, or both are unbatched + - each entry must have matching first two dimensions; representing the batch + axis and the number of boxes per image axis. + - either both `"boxes"` and `"classes"` are batched, or both are unbatched. Additionally, one of the following must be satisfied: - `"boxes"` and `"classes"` are both Ragged @@ -30,7 +32,8 @@ def validate_format(bounding_boxes, variable_name="bounding_boxes"): - `"boxes"` and `"classes"` are unbatched Args: - bounding_boxes: dictionary of bounding boxes according to KerasCV format. 
+ bounding_boxes: dictionary of bounding boxes according to KerasCV + format. Raises: ValueError if any of the above conditions are not met @@ -43,7 +46,7 @@ def validate_format(bounding_boxes, variable_name="bounding_boxes"): if not all([x in bounding_boxes for x in ["boxes", "classes"]]): raise ValueError( f"Expected `{variable_name}` to be a dictionary containing keys " - "`'classes'` and `'boxes'`. Got " + "`'classes'` and `'boxes'`. Got " f"`{variable_name}.keys()={bounding_boxes.keys()}`." ) @@ -59,8 +62,8 @@ def validate_format(bounding_boxes, variable_name="bounding_boxes"): if boxes.shape[:1] != classes.shape[:1]: raise ValueError( "Expected `boxes` and `classes` to have matching dimensions " - "on the first axis when operating in unbatched mode. " - f"Got `boxes.shape={boxes.shape}`, `classes.shape={classes.shape}`." + "on the first axis when operating in unbatched mode. Got " + f"`boxes.shape={boxes.shape}`, `classes.shape={classes.shape}`." ) info["classes_one_hot"] = len(classes.shape) == 2 diff --git a/keras_cv/bounding_box_3d/formats.py b/keras_cv/bounding_box_3d/formats.py index 4b39cd05b0..c9b34dbfe2 100644 --- a/keras_cv/bounding_box_3d/formats.py +++ b/keras_cv/bounding_box_3d/formats.py @@ -17,7 +17,8 @@ class CENTER_XYZ_DXDYDZ_PHI: - """CENTER_XYZ_DXDYDZ_PHI contains axis indices for the CENTER_XYZ_DXDYDZ_PHI format. + """CENTER_XYZ_DXDYDZ_PHI contains axis indices for the CENTER_XYZ_DXDYDZ_PHI + format. CENTER_XYZ_DXDYDZ_PHI is a 3D box format that supports vertical boxes with a heading rotated around the Z axis. diff --git a/keras_cv/callbacks/__init__.py b/keras_cv/callbacks/__init__.py index f2d19d5e0d..735cc796b7 100644 --- a/keras_cv/callbacks/__init__.py +++ b/keras_cv/callbacks/__init__.py @@ -15,7 +15,8 @@ from keras_cv.callbacks.pycoco_callback import PyCOCOCallback except ImportError: print( - "You do not have pyococotools installed, so the `PyCOCOCallback` API is not available." + "You do not have pyococotools installed, so the `PyCOCOCallback` API is" + "not available." ) from keras_cv.callbacks.waymo_evaluation_callback import WaymoEvaluationCallback diff --git a/keras_cv/callbacks/pycoco_callback.py b/keras_cv/callbacks/pycoco_callback.py index ac72a5c9fb..437e853378 100644 --- a/keras_cv/callbacks/pycoco_callback.py +++ b/keras_cv/callbacks/pycoco_callback.py @@ -22,21 +22,23 @@ class PyCOCOCallback(Callback): def __init__( self, validation_data, bounding_box_format, cache=True, **kwargs ): - """Creates a callback to evaluate PyCOCO metrics on a validation dataset. + """Creates a callback to evaluate PyCOCO metrics on a validation + dataset. Args: - validation_data: a tf.data.Dataset containing validation data. Entries - should have the form ```(images, {"boxes": boxes, + validation_data: a tf.data.Dataset containing validation data. + Entries should have the form ```(images, {"boxes": boxes, "classes": classes})```. bounding_box_format: the KerasCV bounding box format used in the validation dataset (e.g. "xywh") - cache: whether the callback should cache the dataset between iterations. - Note that if the validation dataset has shuffling of any kind - (e.g from `shuffle_files=True` in a call to TFDS.load or a call - to tf.data.Dataset.shuffle() with `reshuffle_each_iteration=True`), - you **must** cache the dataset to preserve iteration order. This - will store your entire dataset in main memory, so for large datasets - consider avoiding shuffle operations and passing `cache=False`. 
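For orientation, this callback is attached during `model.fit()`; the `retina_net.py` example earlier in this diff constructs it as `PyCOCOCallback(eval_ds, bounding_box_format="xywh")`. A minimal sketch of that wiring is below; `model`, `train_ds`, and `eval_ds` are assumed to already exist and are not defined here.

```python
# Minimal sketch of attaching PyCOCOCallback, mirroring the retina_net.py
# example in this diff. `model`, `train_ds` and `eval_ds` are assumed to be an
# object detection model and tf.data.Datasets built elsewhere; eval_ds entries
# look like (images, {"boxes": boxes, "classes": classes}) in "xywh" format.
from keras_cv.callbacks import PyCOCOCallback

callbacks = [PyCOCOCallback(eval_ds, bounding_box_format="xywh")]
model.fit(train_ds, epochs=10, callbacks=callbacks)
```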
+ cache: whether the callback should cache the dataset between + iterations. Note that if the validation dataset has shuffling of + any kind (e.g. from `shuffle_files=True` in a call to TFDS). + Load or a call to tf.data.Dataset.shuffle() with + `reshuffle_each_iteration=True`), you **must** cache the dataset + to preserve iteration order. This will store your entire dataset + in main memory, so for large datasets consider avoiding shuffle + operations and passing `cache=False`. """ self.model = None self.val_data = validation_data @@ -85,27 +87,26 @@ def boxes_only(images, boxes): tf.linspace(1, total_images, total_images), precision=0 ) - ground_truth = {} - ground_truth["source_id"] = [source_ids] - ground_truth["height"] = [ - tf.tile(tf.constant([height]), [total_images]) - ] - ground_truth["width"] = [tf.tile(tf.constant([width]), [total_images])] + ground_truth = { + "source_id": [source_ids], + "height": [tf.tile(tf.constant([height]), [total_images])], + "width": [tf.tile(tf.constant([width]), [total_images])], + "num_detections": [gt_boxes.row_lengths(axis=1)], + "boxes": [gt_boxes.to_tensor(-1)], + "classes": [gt_classes.to_tensor(-1)], + } - ground_truth["num_detections"] = [gt_boxes.row_lengths(axis=1)] - ground_truth["boxes"] = [gt_boxes.to_tensor(-1)] - ground_truth["classes"] = [gt_classes.to_tensor(-1)] box_pred = bounding_box.convert_format( box_pred, source=self.bounding_box_format, target="yxyx" ) - predictions = {} - - predictions["source_id"] = [source_ids] - predictions["detection_boxes"] = [box_pred] - predictions["detection_classes"] = [cls_pred] - predictions["detection_scores"] = [confidence_pred] - predictions["num_detections"] = [valid_det] + predictions = { + "source_id": [source_ids], + "detection_boxes": [box_pred], + "detection_classes": [cls_pred], + "detection_scores": [confidence_pred], + "num_detections": [valid_det], + } metrics = compute_pycoco_metrics(ground_truth, predictions) # Mark these as validation metrics by prepending a val_ prefix diff --git a/keras_cv/callbacks/pycoco_callback_test.py b/keras_cv/callbacks/pycoco_callback_test.py index 212715424a..e4557890c5 100644 --- a/keras_cv/callbacks/pycoco_callback_test.py +++ b/keras_cv/callbacks/pycoco_callback_test.py @@ -61,9 +61,9 @@ def test_model_fit_retinanet(self): ) @pytest.mark.skip( - reason="Causing OOMs on GitHub actions. This is not a " - "user facing API and will be replaced in a matter of weeks, so we " - "shouldn't invest engineering resources into working around the OOMs here." + reason="Causing OOMs on GitHub actions. This is not a user facing API " + "and will be replaced in a matter of weeks, so we shouldn't " + "invest engineering resources into working around the OOMs here." ) def test_model_fit_rcnn(self): model = keras_cv.models.FasterRCNN( diff --git a/keras_cv/callbacks/waymo_evaluation_callback.py b/keras_cv/callbacks/waymo_evaluation_callback.py index 9e36742217..0dd8b87971 100644 --- a/keras_cv/callbacks/waymo_evaluation_callback.py +++ b/keras_cv/callbacks/waymo_evaluation_callback.py @@ -32,9 +32,10 @@ def __init__(self, validation_data, config=None, **kwargs): validation dataset. Args: - validation_data: a tf.data.Dataset containing validation data. Entries - should have the form `(point_clouds, {"bounding_boxes": bounding_boxes}`. - Padded bounding box should have a class of -1 to be correctly filtered out. + validation_data: a tf.data.Dataset containing validation data. + Entries should have the form `(point_clouds, {"bounding_boxes": + bounding_boxes}`. 
Padded bounding box should have a class of -1 + to be correctly filtered out. config: an optional `metrics_pb2.Config` object from WOD to specify what metrics should be evaluated. """ @@ -56,9 +57,9 @@ def on_epoch_end(self, epoch, logs=None): metrics_dict = { "average_precision": metrics.average_precision, - "average_precision_ha_weighted": metrics.average_precision_ha_weighted, + "average_precision_ha_weighted": metrics.average_precision_ha_weighted, # noqa: E501 "precision_recall": metrics.precision_recall, - "precision_recall_ha_weighted": metrics.precision_recall_ha_weighted, + "precision_recall_ha_weighted": metrics.precision_recall_ha_weighted, # noqa: E501 "breakdown": metrics.breakdown, } @@ -109,19 +110,18 @@ def flatten_target(boxes): frame_ids = tf.cast(tf.linspace(1, num_frames, num_frames), tf.int64) - ground_truth = {} - ground_truth["ground_truth_frame_id"] = tf.boolean_mask( - tf.repeat(frame_ids, boxes_per_gt_frame), gt_real_boxes - ) - ground_truth["ground_truth_bbox"] = gt_boxes[ - :, : CENTER_XYZ_DXDYDZ_PHI.PHI + 1 - ] - ground_truth["ground_truth_type"] = tf.cast( - gt_boxes[:, CENTER_XYZ_DXDYDZ_PHI.CLASS], tf.uint8 - ) - ground_truth["ground_truth_difficulty"] = tf.cast( - gt_boxes[:, CENTER_XYZ_DXDYDZ_PHI.CLASS + 1], tf.uint8 - ) + ground_truth = { + "ground_truth_frame_id": tf.boolean_mask( + tf.repeat(frame_ids, boxes_per_gt_frame), gt_real_boxes + ), + "ground_truth_bbox": gt_boxes[:, : CENTER_XYZ_DXDYDZ_PHI.PHI + 1], + "ground_truth_type": tf.cast( + gt_boxes[:, CENTER_XYZ_DXDYDZ_PHI.CLASS], tf.uint8 + ), + "ground_truth_difficulty": tf.cast( + gt_boxes[:, CENTER_XYZ_DXDYDZ_PHI.CLASS + 1], tf.uint8 + ), + } boxes_per_pred_frame = model_outputs["boxes"].shape[1] total_predicted_boxes = boxes_per_pred_frame * num_frames @@ -135,22 +135,23 @@ def flatten_target(boxes): prediction_scores = tf.reshape( model_outputs["confidence"], (total_predicted_boxes, 1) ) - # Remove boxes with class of -1 (these are non-boxes that may come from padding) + # Remove boxes with class of -1 (these are non-boxes that may come from + # padding) pred_real_boxes = tf.reduce_all(predicted_classes != -1, axis=[-1]) predicted_boxes = tf.boolean_mask(predicted_boxes, pred_real_boxes) predicted_classes = tf.boolean_mask(predicted_classes, pred_real_boxes) prediction_scores = tf.boolean_mask(prediction_scores, pred_real_boxes) - predictions = {} - - predictions["prediction_frame_id"] = tf.boolean_mask( - tf.repeat(frame_ids, boxes_per_pred_frame), pred_real_boxes - ) - predictions["prediction_bbox"] = predicted_boxes - predictions["prediction_type"] = tf.squeeze(predicted_classes) - predictions["prediction_score"] = tf.squeeze(prediction_scores) - predictions["prediction_overlap_nlz"] = tf.cast( - tf.zeros(predicted_boxes.shape[0]), tf.bool - ) + predictions = { + "prediction_frame_id": tf.boolean_mask( + tf.repeat(frame_ids, boxes_per_pred_frame), pred_real_boxes + ), + "prediction_bbox": predicted_boxes, + "prediction_type": tf.squeeze(predicted_classes), + "prediction_score": tf.squeeze(prediction_scores), + "prediction_overlap_nlz": tf.cast( + tf.zeros(predicted_boxes.shape[0]), tf.bool + ), + } return ground_truth, predictions diff --git a/keras_cv/core/factor_sampler/constant_factor_sampler.py b/keras_cv/core/factor_sampler/constant_factor_sampler.py index 86dce3cf83..af8e63f638 100644 --- a/keras_cv/core/factor_sampler/constant_factor_sampler.py +++ b/keras_cv/core/factor_sampler/constant_factor_sampler.py @@ -20,10 +20,11 @@ @keras.utils.register_keras_serializable(package="keras_cv") 
class ConstantFactorSampler(FactorSampler): - """ConstantFactorSampler samples the same factor for every call to `__call__()`. + """ConstantFactorSampler samples the same factor for every call to + `__call__()`. - This is useful in cases where a user wants to always ensure that an augmentation - layer performs augmentations of the same strength. + This is useful in cases where a user wants to always ensure that an + augmentation layer performs augmentations of the same strength. Args: value: the value to return from `__call__()`. diff --git a/keras_cv/core/factor_sampler/factor_sampler.py b/keras_cv/core/factor_sampler/factor_sampler.py index fd51e95c13..97a7ea919b 100644 --- a/keras_cv/core/factor_sampler/factor_sampler.py +++ b/keras_cv/core/factor_sampler/factor_sampler.py @@ -17,12 +17,14 @@ @keras.utils.register_keras_serializable(package="keras_cv") class FactorSampler: - """FactorSampler represents a strength factor for use in an augmentation layer. + """FactorSampler represents a strength factor for use in an augmentation + layer. - FactorSampler should be subclassed and implement a `__call__()` method that returns - a tf.float32, or a float. This method will be used by preprocessing layers to - determine the strength of their augmentation. The specific range of values - supported may vary by layer, but for most layers is the range [0, 1]. + FactorSampler should be subclassed and implement a `__call__()` method that + returns a tf.float32, or a float. This method will be used by preprocessing + layers to determine the strength of their augmentation. The specific range + of values supported may vary by layer, but for most layers is the range + [0, 1]. """ def __call__(self, shape=None, dtype="float32"): diff --git a/keras_cv/core/factor_sampler/normal_factor_sampler.py b/keras_cv/core/factor_sampler/normal_factor_sampler.py index 36ff82afdf..44363be73f 100644 --- a/keras_cv/core/factor_sampler/normal_factor_sampler.py +++ b/keras_cv/core/factor_sampler/normal_factor_sampler.py @@ -22,8 +22,8 @@ class NormalFactorSampler(FactorSampler): """NormalFactorSampler samples factors from a normal distribution. - This is useful in cases where a user wants to always ensure that an augmentation - layer performs augmentations of the same strength. + This is useful in cases where a user wants to always ensure that an + augmentation layer performs augmentations of the same strength. Args: mean: mean value for the distribution. @@ -40,8 +40,8 @@ class NormalFactorSampler(FactorSampler): upper=1 ) random_sharpness = keras_cv.layers.RandomSharpness(factor=factor) - # random_sharpness will now sample normally around 0.5, with a lower of 0 and upper - # bound of 1. + # random_sharpness will now sample normally around 0.5, with a lower of 0 + # and upper bound of 1. ``` """ diff --git a/keras_cv/core/factor_sampler/uniform_factor_sampler.py b/keras_cv/core/factor_sampler/uniform_factor_sampler.py index f9ec9b7f1a..0c5705312c 100644 --- a/keras_cv/core/factor_sampler/uniform_factor_sampler.py +++ b/keras_cv/core/factor_sampler/uniform_factor_sampler.py @@ -22,14 +22,15 @@ class UniformFactorSampler(FactorSampler): """UniformFactorSampler samples factors uniformly from a range. - This is useful in cases where a user wants to always ensure that an augmentation - layer performs augmentations of the same strength. + This is useful in cases where a user wants to always ensure that an + augmentation layer performs augmentations of the same strength. 
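To make the `FactorSampler` contract above concrete, here is a small sketch of a custom sampler. It assumes the base class is importable from `keras_cv.core`, consistent with the `keras_cv/core/factor_sampler/` layout in this change; the subclass name and the fixed value are illustrative only.

```python
import tensorflow as tf

from keras_cv.core import FactorSampler  # assumed export path


class FixedFactorSampler(FactorSampler):
    """Hypothetical sampler that always reports the same augmentation strength."""

    def __init__(self, value):
        self.value = value

    def __call__(self, shape=None, dtype="float32"):
        # Same signature as the base class; returns a scalar strength factor.
        return tf.constant(self.value, dtype=dtype)


sampler = FixedFactorSampler(0.4)
print(float(sampler()))  # 0.4
```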
Args: lower: the lower bound of values returned from `__call__()`. upper: the upper bound of values returned from `__call__()`. - seed: A shape int or Tensor, the seed to the random number generator. Must have - dtype int32 or int64. (When using XLA, only int32 is allowed.) + seed: A shape int or Tensor, the seed to the random number generator. + Must have dtype int32 or int64. (When using XLA, only int32 is + allowed.) Usage: ```python uniform_factor = keras_cv.UniformFactorSampler(0, 0.5) diff --git a/keras_cv/custom_ops/box_util.cc b/keras_cv/custom_ops/box_util.cc index a0d5a63775..2b91f9c27a 100644 --- a/keras_cv/custom_ops/box_util.cc +++ b/keras_cv/custom_ops/box_util.cc @@ -146,9 +146,9 @@ RotatedBox2D::RotatedBox2D(const double cx, const double cy, const double w, const double h, const double heading) : cx_(cx), cy_(cy), w_(w), h_(h), heading_(heading) { // Compute loose bounds on dimensions of box that doesn't require computing - // full intersection. We can do this by trying to compute the largest circle - // swept by rotating the box around its center. The radius of that circle - // is the length of the ray from the center to the box corner. The upper + // full intersection. We can do this by trying to compute the largest circle + // swept by rotating the box around its center. The radius of that circle + // is the length of the ray from the center to the box corner. The upper // bound for this value is the length of the longer dimension divided by two // and then multiplied by root(2) (worst-case being a square box); we choose // 1.5 as slightly higher than root(2), and then use these extrema to do @@ -240,7 +240,7 @@ bool RotatedBox2D::MaybeIntersects(const RotatedBox2D& other) const { double RotatedBox2D::Intersection(const RotatedBox2D& other) const { // Do a fast intersection check - if the boxes are not near each other - // then we can return early. If they are close enough to maybe overlap, + // then we can return early. If they are close enough to maybe overlap, // we do the full check. if (!MaybeIntersects(other)) { return 0.0; @@ -417,7 +417,7 @@ bool Upright3DBox::WithinBox3D(const Vertex& point) const { } double Upright3DBox::IoU(const Upright3DBox& other) const { - // Check that both boxes are non-zero and valid. Otherwise, + // Check that both boxes are non-zero and valid. Otherwise, // return 0. if (!NonZeroAndValid() || !other.NonZeroAndValid()) { return 0; @@ -443,7 +443,7 @@ double Upright3DBox::IoU(const Upright3DBox& other) const { } double Upright3DBox::Overlap(const Upright3DBox& other) const { - // Check that both boxes are non-zero and valid. Otherwise, + // Check that both boxes are non-zero and valid. Otherwise, // return 0. if (!NonZeroAndValid() || !other.NonZeroAndValid()) { return 0; diff --git a/keras_cv/custom_ops/box_util.h b/keras_cv/custom_ops/box_util.h index 647a46e835..c4c862acdc 100644 --- a/keras_cv/custom_ops/box_util.h +++ b/keras_cv/custom_ops/box_util.h @@ -80,7 +80,7 @@ class RotatedBox2D { // Returns true if this box and 'other' might intersect. // - // If this returns false, the two boxes definitely do not intersect. If this + // If this returns false, the two boxes definitely do not intersect. If this // returns true, it is still possible that the two boxes do not intersect, and // the more expensive intersection code will be called. bool MaybeIntersects(const RotatedBox2D& other) const; @@ -101,13 +101,13 @@ class RotatedBox2D { // dimension. bool extreme_box_dim_ = false; - // The following fields are computed on demand. 
They are logically + // The following fields are computed on demand. They are logically // const. - // Cached area. Access via Area() public API. + // Cached area. Access via Area() public API. mutable double area_ = -1; - // Stores the vertices of the box. Access via box_vertices(). + // Stores the vertices of the box. Access via box_vertices(). mutable std::vector box_vertices_; }; diff --git a/keras_cv/datasets/imagenet/load.py b/keras_cv/datasets/imagenet/load.py index bda41c9d38..6f64620b6a 100644 --- a/keras_cv/datasets/imagenet/load.py +++ b/keras_cv/datasets/imagenet/load.py @@ -74,23 +74,24 @@ def load( ``` Args: - split: the split to load. Should be one of "train" or "validation." + split: the split to load. Should be one of "train" or "validation." tfrecord_path: the path to your preprocessed ImageNet TFRecords. - See keras_cv/datasets/imagenet/README.md for preprocessing instructions. + See keras_cv/datasets/imagenet/README.md for preprocessing + instructions. batch_size: how many instances to include in batches after loading. Should only be specified if img_size is specified (so that images can be resized to the same size before batching). - shuffle: whether or not to shuffle the dataset. Defaults to True. + shuffle: whether to shuffle the dataset, defaults to True. shuffle_buffer: the size of the buffer to use in shuffling. - reshuffle_each_iteration: whether to reshuffle the dataset on every epoch. - Defaults to False. - img_size: the size to resize the images to. Defaults to None, indicating + reshuffle_each_iteration: whether to reshuffle the dataset on every + epoch, defaults to False. + img_size: the size to resize the images to, defaults to None, indicating that images should not be resized. Returns: - tf.data.Dataset containing ImageNet. Each entry is a dictionary containing - keys {"image": image, "label": label} where images is a Tensor of shape - [H, W, 3] and label is a Tensor of shape [1000]. + tf.data.Dataset containing ImageNet. Each entry is a dictionary + containing keys {"image": image, "label": label} where images is a + Tensor of shape [H, W, 3] and label is a Tensor of shape [1000]. """ if batch_size is not None and img_size is None: @@ -116,8 +117,8 @@ def load( if shuffle: if not batch_size and not shuffle_buffer: raise ValueError( - "If `shuffle=True`, either a `batch_size` or `shuffle_buffer` must be " - "provided to `keras_cv.datasets.imagenet.load().`" + "If `shuffle=True`, either a `batch_size` or `shuffle_buffer` " + "must be provided to `keras_cv.datasets.imagenet.load().`" ) shuffle_buffer = shuffle_buffer or 8 * batch_size dataset = dataset.shuffle( diff --git a/keras_cv/datasets/pascal_voc/load.py b/keras_cv/datasets/pascal_voc/load.py index 9d468cdfcc..f3614d75f8 100644 --- a/keras_cv/datasets/pascal_voc/load.py +++ b/keras_cv/datasets/pascal_voc/load.py @@ -58,25 +58,25 @@ def load( ``` Args: - split: the split string passed to the `tensorflow_datasets.load()` call. Should - be one of "train", "test", or "validation." - bounding_box_format: the keras_cv bounding box format to load the boxes into. - For a list of supported formats, please Refer + split: the split string passed to the `tensorflow_datasets.load()` call. + Should be one of "train", "test", or "validation." + bounding_box_format: the keras_cv bounding box format to load the boxes + into. For a list of supported formats, please refer [to the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/) for more details on supported bounding box formats. 
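Stepping back to the ImageNet loader documented above, a typical call might look like the sketch below. The TFRecord path is a placeholder, and the records are assumed to have been produced following `keras_cv/datasets/imagenet/README.md`.

```python
import keras_cv

train_ds = keras_cv.datasets.imagenet.load(
    split="train",
    tfrecord_path="gs://my-bucket/imagenet-tfrecords",  # hypothetical path
    batch_size=64,
    img_size=(224, 224),
    shuffle=True,
)
for batch in train_ds.take(1):
    # Each entry is {"image": ..., "label": ...} per the docstring above.
    print(batch["image"].shape, batch["label"].shape)  # (64, 224, 224, 3), (64, 1000)
```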
batch_size: how many instances to include in batches after loading - shuffle: whether or not to shuffle the dataset. Defaults to True. shuffle_buffer: the size of the buffer to use in shuffling. - shuffle_files: (Optional) whether or not to shuffle files, defaults to True. - dataset: (Optional) the PascalVOC dataset to load from. Should be either - 'voc/2007' or 'voc/2012'. Defaults to 'voc/2007'. + shuffle_files: (Optional) whether to shuffle files, defaults to + True. + dataset: (Optional) the PascalVOC dataset to load from. Should be either + 'voc/2007' or 'voc/2012', defaults to 'voc/2007'. Returns: - tf.data.Dataset containing PascalVOC. Each entry is a dictionary containing - keys {"images": images, "bounding_boxes": bounding_boxes} where images is a - Tensor of shape [batch, H, W, 3] and bounding_boxes is a `tf.RaggedTensor` of - shape [batch, None, 5]. - """ + tf.data.Dataset containing PascalVOC. Each entry is a dictionary + containing keys {"images": images, "bounding_boxes": bounding_boxes} + where images is a Tensor of shape [batch, H, W, 3] and bounding_boxes is + a `tf.RaggedTensor` of shape [batch, None, 5]. + """ # noqa: E501 if dataset not in ["voc/2007", "voc/2012"]: raise ValueError( "keras_cv.datasets.pascal_voc.load() expects the `dataset` " diff --git a/keras_cv/datasets/pascal_voc/segmentation.py b/keras_cv/datasets/pascal_voc/segmentation.py index fd0e756280..776c235c80 100644 --- a/keras_cv/datasets/pascal_voc/segmentation.py +++ b/keras_cv/datasets/pascal_voc/segmentation.py @@ -14,20 +14,22 @@ """Data loader for Pascal VOC 2012 segmentation dataset. -The image classification and object detection (bounding box) data is covered by existing -TF datasets in https://www.tensorflow.org/datasets/catalog/voc. The segmentation data ( -both class segmentation and instance segmentation) are included in the VOC 2012, but not -offered by TF-DS yet. This module is trying to fill this gap while TFDS team can -address this feature (b/252870855, https://github.com/tensorflow/datasets/issues/27 and +The image classification and object detection (bounding box) data is covered by +existing TF datasets in https://www.tensorflow.org/datasets/catalog/voc. The +segmentation data (both class segmentation and instance segmentation) are +included in the VOC 2012, but not offered by TF-DS yet. This module is trying to +fill this gap while TFDS team can address this feature (b/252870855, +https://github.com/tensorflow/datasets/issues/27 and https://github.com/tensorflow/datasets/pull/1198). -The schema design is similar to the existing design of TFDS, but trimmed to fit the need -of Keras CV models. +The schema design is similar to the existing design of TFDS, but trimmed to fit +the need of Keras CV models. This module contains following functionalities: 1. Download and unpack original data from Pascal VOC. -2. Reprocess and build up dataset that include image, class label, object bounding boxes, +2. Reprocess and build up dataset that include image, class label, object + bounding boxes, class and instance segmentation masks. 3. Produce tfrecords from the dataset. 4. Load existing tfrecords from result in 3. @@ -45,9 +47,7 @@ class and instance segmentation masks. 
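A matching sketch for the PascalVOC detection loader whose arguments are documented just above; the `"xywh"` format and the batch size are illustrative choices rather than defaults, and the returned dataset is used exactly as the docstring describes.

```python
import keras_cv

train_ds = keras_cv.datasets.pascal_voc.load(
    split="train",
    bounding_box_format="xywh",
    batch_size=8,
)
for example in train_ds.take(1):
    # {"images": [batch, H, W, 3], "bounding_boxes": [batch, None, 5] (ragged)}
    print(example["images"].shape, example["bounding_boxes"].shape)
```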
import tensorflow_datasets as tfds from tensorflow import keras -VOC_URL = ( - "http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar" -) +VOC_URL = "https://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar" # noqa: E501 """ @InProceedings{{BharathICCV2011, @@ -55,13 +55,14 @@ class and instance segmentation masks. title = "Semantic Contours from Inverse Detectors", booktitle = "International Conference on Computer Vision (ICCV)", year = "2011"}} -""" -SBD_URL = "http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz" +""" # noqa: E501 +SBD_URL = "https://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz" # noqa: E501 -# Note that this list doesn't contain the background class. In the classification use -# case, the label is 0 based (aeroplane -> 0), whereas in segmentation use case, the 0 is -# reserved for background, so aeroplane maps to 1. +# Note that this list doesn't contain the background class. In the +# classification use case, the label is 0 based (aeroplane -> 0), whereas in +# segmentation use case, the 0 is reserved for background, so aeroplane maps to +# 1. CLASSES = [ "aeroplane", "bicycle", @@ -87,11 +88,11 @@ class and instance segmentation masks. # This is used to map between string class to index. CLASS_TO_INDEX = {name: index for index, name in enumerate(CLASSES)} -# For the mask data in the PNG file, the encoded raw pixel value need be to converted -# to the proper class index. In the following map, [0, 0, 0] will be convert to 0, and -# [128, 0, 0] will be conveted to 1, so on so forth. Also note that the mask class is 1 -# base since class 0 is reserved for the background. The [128, 0, 0] (class 1) is mapped -# to `aeroplane`. +# For the mask data in the PNG file, the encoded raw pixel value need to be +# converted to the proper class index. In the following map, [0, 0, 0] will be +# convert to 0, and [128, 0, 0] will be converted to 1, so on so forth. Also +# note that the mask class is 1 base since class 0 is reserved for the +# background. The [128, 0, 0] (class 1) is mapped to `aeroplane`. VOC_PNG_COLOR_VALUE = [ [0, 0, 0], [128, 0, 0], @@ -140,7 +141,8 @@ def _download_data_file( """Fetch the original VOC or Semantic Boundaries Dataset from remote URL. Args: - data_url: string, the URL for the data to be downloaded, should be in a zipped tar package. + data_url: string, the URL for the data to be downloaded, should be in a + zipped tar package. local_dir_path: string, the local directory path to save the data. Returns: the path to the folder of extracted data. @@ -171,7 +173,8 @@ def _download_data_file( def _parse_annotation_data(annotation_file_path): """Parse the annotation XML file for the image. - The annotation contains the metadata, as well as the object bounding box information. + The annotation contains the metadata, as well as the object bounding box + information. """ with tf.io.gfile.GFile(annotation_file_path, "r") as f: @@ -340,13 +343,16 @@ def _build_sbd_metadata(data_dir, image_ids): return result -# With jit_compile=True, there will be 0.4 sec compilation overhead, but save about 0.2 -# sec per 1000 images. See https://github.com/keras-team/keras-cv/pull/943#discussion_r1001092882 +# With jit_compile=True, there will be 0.4 sec compilation overhead, but save +# about 0.2 sec per 1000 images. See +# https://github.com/keras-team/keras-cv/pull/943#discussion_r1001092882 # for more details. 
@tf.function(jit_compile=True) def _decode_png_mask(mask): - """Decode the raw PNG image and convert it to 2D tensor with probably class.""" - # Cast the mask to int32 since the original uint8 will overflow when multiple with 256 + """Decode the raw PNG image and convert it to 2D tensor with probably + class.""" + # Cast the mask to int32 since the original uint8 will overflow when + # multiplied with 256 mask = tf.cast(mask, tf.int32) mask = mask[:, :, 0] * 256 * 256 + mask[:, :, 1] * 256 + mask[:, :, 2] mask = tf.expand_dims(tf.gather(VOC_PNG_COLOR_MAPPING, mask), -1) @@ -463,21 +469,23 @@ def load( ): """Load the Pacal VOC 2012 dataset. - This function will download the data tar file from remote if needed, and untar to - the local `data_dir`, and build dataset from it. + This function will download the data tar file from remote if needed, and + untar to the local `data_dir`, and build dataset from it. It supports both VOC2012 and Semantic Boundaries Dataset (SBD). - The returned segmentation masks will be int ranging from [0, num_classes), as well as - 255 which is the boundary mask. + The returned segmentation masks will be int ranging from [0, num_classes), + as well as 255 which is the boundary mask. Args: - split: string, can be 'train', 'eval', 'trainval", 'sbd_train', or 'sbd_eval'. - 'sbd_train' represents the training dataset for SBD dataset, while 'train' represents - the training dataset for VOC2012 dataset. Default to `sbd_train`. - data_dir: string, local directory path for the loaded data. This will be used to - download the data file, and unzip. It will be used as a cach directory. - Default to None, and `~/.keras/pascal_voc_2012` will be used. + split: string, can be 'train', 'eval', 'trainval", 'sbd_train', or + 'sbd_eval'. 'sbd_train' represents the training dataset for SBD + dataset, while 'train' represents the training dataset for VOC2012 + dataset. Defaults to `sbd_train`. + data_dir: string, local directory path for the loaded data. This will be + used to download the data file, and unzip. It will be used as a + cache directory. Defaults to None, and `~/.keras/pascal_voc_2012` + will be used. """ supported_split_value = [ "train", diff --git a/keras_cv/datasets/pascal_voc/segmentation_test.py b/keras_cv/datasets/pascal_voc/segmentation_test.py index 5c40f3bc87..53c3fa5056 100644 --- a/keras_cv/datasets/pascal_voc/segmentation_test.py +++ b/keras_cv/datasets/pascal_voc/segmentation_test.py @@ -27,8 +27,8 @@ class PascalVocSegmentationDataTest(tf.test.TestCase): def setUp(self): super().setUp() self.tempdir = self.get_tempdir() - # Note that this will not work with bazel, need to be rewrite into relying on - # FLAGS.test_srcdir + # Note that this will not work with bazel, need to be rewritten into + # relying on FLAGS.test_srcdir self.test_data_tar_path = os.path.abspath( os.path.join( os.path.abspath(__file__), @@ -47,8 +47,8 @@ def get_tempdir(self): return self.create_tempdir().full_path def test_download_data(self): - # Since the original data package is too large, we use a small package as a - # replacement. + # Since the original data package is too large, we use a small package + # as a replacement. 
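The `_decode_png_mask` arithmetic above is easier to follow with one hand-worked pixel. The standalone sketch below (not the library function) flattens an RGB value the same way, producing the integer key that `VOC_PNG_COLOR_MAPPING` translates into a class index.

```python
import tensorflow as tf

# A single mask pixel with RGB [128, 0, 0], the VOC palette entry for class 1.
pixel = tf.constant([[[128, 0, 0]]], dtype=tf.uint8)
pixel = tf.cast(pixel, tf.int32)  # cast first; uint8 would overflow when scaled by 256
flat = pixel[:, :, 0] * 256 * 256 + pixel[:, :, 1] * 256 + pixel[:, :, 2]
print(int(flat[0, 0]))  # 8388608, which the mapping resolves to class 1 ("aeroplane")
```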
local_data_dir = os.path.join(self.tempdir, "pascal_voc_2012/") test_data_dir = segmentation._download_data_file( data_url=pathlib.Path(self.test_data_tar_path).as_uri(), @@ -57,7 +57,8 @@ def test_download_data(self): ) self.assertTrue(os.path.exists(test_data_dir)) - # Make sure the data is unzipped correctly and populated with correct content + # Make sure the data is unzipped correctly and populated with correct + # content. expected_subdirs = [ "Annotations", "ImageSets", @@ -78,8 +79,8 @@ def test_skip_download_and_override(self): local_dir_path=local_data_dir, ) - # Touch a file in the test_data_dir and make sure it exists (not being override) - # when invoke the _download_data_file again + # Touch a file in the test_data_dir and make sure it exists (not being + # overridden) when invoking the _download_data_file again os.makedirs(os.path.join(test_data_dir, "Annotations", "dummy_dir")) segmentation._download_data_file( data_url=pathlib.Path(self.test_data_tar_path).as_uri(), @@ -182,8 +183,8 @@ def test_decode_png_mask(self): self.assertEquals( tf.reduce_min(mask), 0 ) # The 0 value is for the background - # The mask contains two classes, 1 and 15, see the label section in the previous - # test case. + # The mask contains two classes, 1 and 15, see the label section in the + # previous test case. self.assertEquals( tf.reduce_sum(tf.cast(tf.equal(mask, 1), tf.int32)), 4734 ) @@ -321,8 +322,8 @@ def test_build_dataset(self): self.assertEquals( tf.reduce_min(png), 0 ) # The 0 value is for the background - # The mask contains two classes, 1 and 15, see the label section in the previous - # test case. + # The mask contains two classes, 1 and 15, see the label section in the + # previous test case. self.assertEquals( tf.reduce_sum(tf.cast(tf.equal(png, 1), tf.int32)), 4734 ) diff --git a/keras_cv/datasets/waymo/load.py b/keras_cv/datasets/waymo/load.py index cea58285d6..c9729bd514 100644 --- a/keras_cv/datasets/waymo/load.py +++ b/keras_cv/datasets/waymo/load.py @@ -48,11 +48,11 @@ def load( tfrecords in the Waymo Open Dataset, or a list of strings pointing to the tfrecords themselves transformer: a Python function which transforms a Waymo Open Dataset - Frame object into tensors. Default to convert range image to point + Frame object into tensors, defaults to convert range image to point cloud. output_signature: the type specification of the tensors created by the transformer. This is often a dictionary from feature column names to - tf.TypeSpecs. Default to point cloud representations of Waymo Open + tf.TypeSpecs, defaults to point cloud representations of Waymo Open Dataset data. Returns: diff --git a/keras_cv/datasets/waymo/transformer.py b/keras_cv/datasets/waymo/transformer.py index 152574d33d..c631cecc37 100644 --- a/keras_cv/datasets/waymo/transformer.py +++ b/keras_cv/datasets/waymo/transformer.py @@ -57,8 +57,8 @@ "label_point_nlz": tf.TensorSpec([None], tf.int32), } -# Maximum number of points from all lidars excluding the top lidar. -# Please refer to https://arxiv.org/pdf/1912.04838.pdf Figure 1 for sensor layouts. +# Maximum number of points from all lidars excluding the top lidar. Please refer +# to https://arxiv.org/pdf/1912.04838.pdf Figure 1 for sensor layouts. _MAX_NUM_NON_TOP_LIDAR_POINTS = 30000 @@ -265,16 +265,16 @@ def _get_point_lidar( frame, max_num_points: int, ) -> struct.PointTensors: - """Gets point related tensors for non top lidar. + """Gets point related tensors for non-top lidar. 
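For the Waymo loader documented a little earlier, usage is roughly as sketched below. This assumes the optional `waymo-open-dataset` dependency is installed; the directory path is a placeholder.

```python
from keras_cv.datasets import waymo

# Hypothetical directory of Waymo Open Dataset tfrecords.
dataset = waymo.load("/data/wod/training")

# With the default transformer, each element is a dict of point cloud tensors
# such as "point_xyz", "point_feature" and "label_box".
print(dataset.element_spec.keys())
```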
- The main differences from top lidar extraction are related to second return and - point down sampling. + The main differences from top lidar extraction are related to second return + and point down sampling. Args: ris: Mapping from lidar ID to range image tensor. The ri format is [range, intensity, elongation, is_in_nlz]. frame: a Waymo Open Dataset frame. - max_num_points: maximum number of points from non top lidar. + max_num_points: maximum number of points from non-top lidar. Returns: Point related tensors. @@ -351,7 +351,7 @@ def _get_point(frame, max_num_lidar_points: int) -> struct.PointTensors: Args: frame: a Waymo Open Dataset frame. - max_num_lidar_points: maximum number of points from non top lidars. + max_num_lidar_points: maximum number of points from non-top lidars. Returns: Point related tensors. @@ -612,8 +612,9 @@ def _box_3d_global_to_vehicle( def build_tensors_from_wod_frame(frame) -> Dict[str, tf.Tensor]: """Builds tensors from a Waymo Open Dataset frame. - This function is to convert range image to point cloud. User can also work with - range image directly with frame_utils functions from waymo_open_dataset. + This function is to convert range image to point cloud. User can also work + with range image directly with frame_utils functions from + waymo_open_dataset. Args: frame: a Waymo Open Dataset frame. @@ -659,13 +660,13 @@ def build_tensors_from_wod_frame(frame) -> Dict[str, tf.Tensor]: "point_xyz": point_tensors.point_xyz, "point_feature": point_tensors.point_feature, "point_mask": tf.ones([num_points], dtype=tf.bool), - "point_range_image_row_col_sensor_id": point_tensors.point_range_image_row_col_sensor_id, + "point_range_image_row_col_sensor_id": point_tensors.point_range_image_row_col_sensor_id, # noqa: E501 "label_box": point_label_tensors.label_box, "label_box_id": point_label_tensors.label_box_id, "label_box_meta": point_label_tensors.label_box_meta, "label_box_class": point_label_tensors.label_box_class, "label_box_density": point_label_tensors.label_box_density, - "label_box_detection_difficulty": point_label_tensors.label_box_detection_difficulty, + "label_box_detection_difficulty": point_label_tensors.label_box_detection_difficulty, # noqa: E501 "label_box_mask": point_label_tensors.label_box_mask, "label_point_class": point_label_tensors.label_point_class, "label_point_nlz": point_tensors.label_point_nlz, @@ -722,10 +723,12 @@ def _pad_fn(t: tf.Tensor, max_counts: int) -> tf.Tensor: def transform_to_vehicle_frame( frame: Dict[str, tf.Tensor] ) -> Dict[str, tf.Tensor]: - """Transform tensors in a frame from global coordinates to vehicle coordinates. + """Transform tensors in a frame from global coordinates to vehicle + coordinates. Args: - frame: a dictionary of feature tensors from a Waymo Open Dataset frame in global frame. + frame: a dictionary of feature tensors from a Waymo Open Dataset frame in + global frame. Returns: diff --git a/keras_cv/keypoint/converters.py b/keras_cv/keypoint/converters.py index df4ad45101..ecde73c5b9 100644 --- a/keras_cv/keypoint/converters.py +++ b/keras_cv/keypoint/converters.py @@ -63,16 +63,16 @@ def convert_format(keypoints, source, target, images=None, dtype=None): Supported formats are: - `"xy"`, absolute pixel positions. - - `"rel_xyxy"`. relative pixel positions. + - `"rel_xyxy"`. relative pixel positions. - Formats are case insensitive. It is recommended that you + Formats are case-insensitive. 
It is recommended that you capitalize width and height to maximize the visual difference between `"xyWH"` and `"xyxy"`. Relative formats, abbreviated `rel`, make use of the shapes of the - `images` passsed. In these formats, the coordinates, widths, and + `images` passed. In these formats, the coordinates, widths, and heights are all specified as percentages of the host image. - `images` may be a ragged Tensor. Note that using a ragged Tensor + `images` may be a ragged Tensor. Note that using a ragged Tensor for images may cause a substantial performance loss, as each image will need to be processed separately due to the mismatching image shapes. @@ -93,18 +93,18 @@ def convert_format(keypoints, source, target, images=None, dtype=None): keypoints: tf.Tensor or tf.RaggedTensor representing keypoints in the format specified in the `source` parameter. `keypoints` can optionally have extra dimensions stacked - on the final axis to store metadata. keypoints should + on the final axis to store metadata. keypoints should have a rank between 2 and 4, with the shape `[num_boxes,*]`, `[batch_size, num_boxes, *]` or `[batch_size, num_groups, num_keypoints,*]`. source: One of {" ".join([f'"{f}"' for f in - TO_XY_CONVERTERS.keys()])}. Used to specify the original + TO_XY_CONVERTERS.keys()])}. Used to specify the original format of the `boxes` parameter. target: One of {" ".join([f'"{f}"' for f in - TO_XY_CONVERTERS.keys()])}. Used to specify the + TO_XY_CONVERTERS.keys()])}. Used to specify the destination format of the `boxes` parameter. images: (Optional) a batch of images aligned with `boxes` on - the first axis. Should be rank 3 (`HWC` format) or 4 + the first axis. Should be rank 3 (`HWC` format) or 4 (`BHWC` format). Used in some converters to compute relative pixel values of the bounding box dimensions. Required when transforming from a rel format to a non-rel @@ -169,8 +169,9 @@ def _format_inputs(keypoints, images): images_include_batch = images_rank == 4 if keypoints_includes_batch != images_include_batch: raise ValueError( - "convert_format() expects both `keypoints` and `images` to be batched " - f"or both unbatched. Received len(keypoints.shape)={keypoints_rank}, " + "convert_format() expects both `keypoints` and `images` to be " + "batched or both unbatched. Received " + f"len(keypoints.shape)={keypoints_rank}, " f"len(images.shape)={images_rank}. Expected either " "len(keypoints.shape)=2 and len(images.shape)=3, or " "len(keypoints.shape)>=3 and len(images.shape)=4." diff --git a/keras_cv/keypoint/converters_test.py b/keras_cv/keypoint/converters_test.py index 87c6407c1e..30b5e130f7 100644 --- a/keras_cv/keypoint/converters_test.py +++ b/keras_cv/keypoint/converters_test.py @@ -125,7 +125,8 @@ def test_raise_errors_when_missing_shape(self): "keypoint_rank", tf.ones([2, 3, 4, 2, 1]), None, - "Expected keypoints rank to be in [2, 4], got len(keypoints.shape)=5.", + "Expected keypoints rank to be in [2, 4], got " + "len(keypoints.shape)=5.", ), ( "images_rank", @@ -137,10 +138,11 @@ def test_raise_errors_when_missing_shape(self): "batch_mismatch", tf.ones([2, 4, 2]), tf.ones([35, 35, 3]), - "convert_format() expects both `keypoints` and `images` to be batched or " - "both unbatched. Received len(keypoints.shape)=3, len(images.shape)=3. " - "Expected either len(keypoints.shape)=2 and len(images.shape)=3, or " - "len(keypoints.shape)>=3 and len(images.shape)=4.", + "convert_format() expects both `keypoints` and `images` to be " + "batched or both unbatched. 
Received len(keypoints.shape)=3, " + "len(images.shape)=3. Expected either len(keypoints.shape)=2 and " + "len(images.shape)=3, or len(keypoints.shape)>=3 and " + "len(images.shape)=4.", ), ) def test_input_format_exception(self, keypoints, images, expected): diff --git a/keras_cv/keypoint/formats.py b/keras_cv/keypoint/formats.py index ee94d26494..d317087e1e 100644 --- a/keras_cv/keypoint/formats.py +++ b/keras_cv/keypoint/formats.py @@ -43,7 +43,7 @@ class REL_XY: REL_XY is like XY, but each value is relative to the width and height of the - origin image. Values are percentages of the origin images' width and height + origin image. Values are percentages of the origin images' width and height respectively. The REL_XY format consists of the following required indices: diff --git a/keras_cv/layers/__init__.py b/keras_cv/layers/__init__.py index 5fd23d4878..a8bd590094 100644 --- a/keras_cv/layers/__init__.py +++ b/keras_cv/layers/__init__.py @@ -82,13 +82,13 @@ from keras_cv.layers.preprocessing.rescaling import Rescaling from keras_cv.layers.preprocessing.resizing import Resizing from keras_cv.layers.preprocessing.solarization import Solarization -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.layers.preprocessing_3d.frustum_random_dropping_points import ( FrustumRandomDroppingPoints, ) -from keras_cv.layers.preprocessing_3d.frustum_random_point_feature_noise import ( +from keras_cv.layers.preprocessing_3d.frustum_random_point_feature_noise import ( # noqa: E501 FrustumRandomPointFeatureNoise, ) from keras_cv.layers.preprocessing_3d.global_random_dropping_points import ( diff --git a/keras_cv/layers/feature_pyramid.py b/keras_cv/layers/feature_pyramid.py index fcf1661a01..f0be0ce3fd 100644 --- a/keras_cv/layers/feature_pyramid.py +++ b/keras_cv/layers/feature_pyramid.py @@ -21,33 +21,33 @@ class FeaturePyramid(keras.layers.Layer): """Implements a Feature Pyramid Network. This implements the paper: - Tsung-Yi Lin, Piotr Dollar, Ross Girshick, Kaiming He, Bharath Hariharan, and - Serge Belongie. - Feature Pyramid Networks for Object Detection. + Tsung-Yi Lin, Piotr Dollar, Ross Girshick, Kaiming He, Bharath Hariharan, + and Serge Belongie. Feature Pyramid Networks for Object Detection. (https://arxiv.org/pdf/1612.03144) Feature Pyramid Networks (FPNs) are basic components that are added to an - existing feature extractor (CNN) to combine features at different scales. For the - basic FPN, the inputs are features `Ci` from different levels of a CNN, which is - usually the last block for each level, where the feature is scaled from the image - by a factor of `1/2^i`. + existing feature extractor (CNN) to combine features at different scales. + For the basic FPN, the inputs are features `Ci` from different levels of a + CNN, which is usually the last block for each level, where the feature is + scaled from the image by a factor of `1/2^i`. - There is an output associated with each level in the basic FPN. The output Pi - at level `i` (corresponding to Ci) is given by performing a merge operation on - the outputs of: + There is an output associated with each level in the basic FPN. 
The output + Pi at level `i` (corresponding to Ci) is given by performing a merge + operation on the outputs of: - 1) a lateral operation on Ci (usually a conv2D layer with kernel = 1 and strides = 1) + 1) a lateral operation on Ci (usually a conv2D layer with kernel = 1 and + strides = 1) 2) a top-down upsampling operation from Pi+1 (except for the top most level) The final output of each level will also have a conv2D operation - (usually with kernel = 3 and strides = 1). + (typically with kernel = 3 and strides = 1). The inputs to the layer should be a dict with int keys should match the - pyramid_levels, e.g. for `pyramid_levels` = [2,3,4,5], the expected input dict should - be `{2:c2, 3:c3, 4:c4, 5:c5}`. + pyramid_levels, e.g. for `pyramid_levels` = [2,3,4,5], the expected input + dict should be `{2:c2, 3:c3, 4:c4, 5:c5}`. - The output of the layer will have same structures as the inputs, a dict with int keys - and value for each of the level. + The output of the layer will have same structures as the inputs, a dict with + int keys and value for each of the level. Args: min_level: a python int for the lowest level of the pyramid for @@ -55,31 +55,41 @@ class FeaturePyramid(keras.layers.Layer): max_level: a python int for the highest level of the pyramid for feature extraction. num_channels: an integer representing the number of channels for the FPN - operations. Defaults to 256. - lateral_layers: a python dict with int keys that matches to each of the pyramid - level. The values of the dict should be `keras.Layer`, which will be called - with feature activation outputs from backbone at each level. Default to - None, and a `keras.Conv2D` layer with kernel 1x1 will be created for each - pyramid level. - output_layers: a python dict with int keys that matches to each of the pyramid - level. The values of the dict should be `keras.Layer`, which will be called - with feature inputs and merged result from upstream levels. Default to None, - and a `keras.Conv2D` layer with kernel 3x3 will be created for each pyramid - level. + operations, defaults to 256. + lateral_layers: a python dict with int keys that matches to each of the + pyramid level. The values of the dict should be `keras.Layer`, which + will be called with feature activation outputs from backbone at each + level. Defaults to None, and a `keras.Conv2D` layer with kernel 1x1 + will be created for each pyramid level. + output_layers: a python dict with int keys that matches to each of the + pyramid level. The values of the dict should be `keras.Layer`, which + will be called with feature inputs and merged result from upstream + levels. Defaults to None, and a `keras.Conv2D` layer with kernel 3x3 + will be created for each pyramid level. 
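Before the backbone-based sample usage that follows, an even smaller sketch of the `{level: feature}` contract may help. The shapes below are made up, standing in for real `Ci` activations of a 256x256 input.

```python
import tensorflow as tf

import keras_cv

# Stand-in features Ci at strides 2^i of a 256x256 image.
features = {
    2: tf.random.uniform((1, 64, 64, 64)),
    3: tf.random.uniform((1, 32, 32, 128)),
    4: tf.random.uniform((1, 16, 16, 256)),
    5: tf.random.uniform((1, 8, 8, 512)),
}
fpn = keras_cv.layers.FeaturePyramid(min_level=2, max_level=5)
outputs = fpn(features)
print({level: tuple(out.shape) for level, out in outputs.items()})
# Each Pi keeps its spatial size and ends up with num_channels (256) filters.
```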
Sample Usage: ```python inp = keras.layers.Input((384, 384, 3)) - backbone = keras.applications.EfficientNetB0(input_tensor=inp, include_top=False) - layer_names = ['block2b_add', 'block3b_add', 'block5c_add', 'top_activation'] + backbone = keras.applications.EfficientNetB0( + input_tensor=inp, + include_top=False + ) + layer_names = ['block2b_add', + 'block3b_add', + 'block5c_add', + 'top_activation' + ] backbone_outputs = {} for i, layer_name in enumerate(layer_names): backbone_outputs[i+2] = backbone.get_layer(layer_name).output # output_dict is a dict with 2, 3, 4, 5 as keys - output_dict = keras_cv.layers.FeaturePyramid(min_level=2, max_level=5)(backbone_outputs) + output_dict = keras_cv.layers.FeaturePyramid( + min_level=2, + max_level=5 + )(backbone_outputs) ``` """ @@ -148,22 +158,23 @@ def _validate_user_layers(self, user_input, param_name): ) def call(self, features): - # Note that this assertion might not be true for all the subclasses. It is - # possible to have FPN that has high levels than the height of backbone outputs. + # Note that this assertion might not be true for all the subclasses. It + # is possible to have FPN that has high levels than the height of + # backbone outputs. if ( not isinstance(features, dict) or sorted(features.keys()) != self.pyramid_levels ): raise ValueError( - "FeaturePyramid expects input features to be a dict with int keys " - "that match the values provided in pyramid_levels. " + "FeaturePyramid expects input features to be a dict with int " + "keys that match the values provided in pyramid_levels. " f"Expect feature keys: {self.pyramid_levels}, got: {features}" ) return self.build_feature_pyramid(features) def build_feature_pyramid(self, input_features): - # To illustrate the connection/topology, the basic flow for a FPN with level - # 3, 4, 5 is like below: + # To illustrate the connection/topology, the basic flow for a FPN with + # level 3, 4, 5 is like below: # # input_l5 -> conv2d_1x1_l5 ----V---> conv2d_3x3_l5 -> output_l5 # V @@ -181,13 +192,14 @@ def build_feature_pyramid(self, input_features): for level in reversed_levels: output = self.lateral_layers[level](input_features[level]) if level < top_level: - # for the top most output, it doesn't need to merge with any upper stream - # outputs + # for the top most output, it doesn't need to merge with any + # upper stream outputs upstream_output = self.top_down_op(output_features[level + 1]) output = self.merge_op([output, upstream_output]) output_features[level] = output - # Post apply the output layers so that we don't leak them to the down stream level + # Post apply the output layers so that we don't leak them to the down + # stream level for level in reversed_levels: output_features[level] = self.output_layers[level]( output_features[level] diff --git a/keras_cv/layers/fusedmbconv.py b/keras_cv/layers/fusedmbconv.py index a281d09f7b..b03543f8c7 100644 --- a/keras_cv/layers/fusedmbconv.py +++ b/keras_cv/layers/fusedmbconv.py @@ -32,49 +32,59 @@ @keras.utils.register_keras_serializable(package="keras_cv") class FusedMBConvBlock(layers.Layer): """ - Implementation of the FusedMBConv block (Fused Mobile Inverted Residual Bottleneck) from: - (EfficientNet-EdgeTPU: Creating Accelerator-Optimized Neural Networks with AutoML)[https://ai.googleblog.com/2019/08/efficientnet-edgetpu-creating.html] - (EfficientNetV2: Smaller Models and Faster Training)[https://arxiv.org/abs/2104.00298v3]. 
+ Implementation of the FusedMBConv block (Fused Mobile Inverted Residual + Bottleneck) from: + [EfficientNet-EdgeTPU: Creating Accelerator-Optimized Neural Networks with AutoML](https://ai.googleblog.com/2019/08/efficientnet-edgetpu-creating.html) + [EfficientNetV2: Smaller Models and Faster Training](https://arxiv.org/abs/2104.00298v3). - FusedMBConv blocks are based on MBConv blocks, and replace the depthwise and 1x1 output convolution - blocks with a single 3x3 convolution block, fusing them together - hence the name "FusedMBConv". - Alongside MBConv blocks, they can be used in mobile-oriented and efficient architectures, - and are present in architectures EfficientNet. + FusedMBConv blocks are based on MBConv blocks, and replace the depthwise and + 1x1 output convolution blocks with a single 3x3 convolution block, fusing + them together - hence the name "FusedMBConv". Alongside MBConv blocks, they + can be used in mobile-oriented and efficient architectures, and are present + in architectures EfficientNet. - FusedMBConv blocks follow a narrow-wide-narrow structure - expanding a 1x1 convolution, performing - Squeeze-Excitation and then applying a 3x3 convolution, which is a more efficient operation than - conventional wide-narrow-wide structures. + FusedMBConv blocks follow a narrow-wide-narrow structure - expanding a 1x1 + convolution, performing Squeeze-Excitation and then applying a 3x3 + convolution, which is a more efficient operation than conventional + wide-narrow-wide structures. - As they're frequently used for models to be deployed to edge devices, they're - implemented as a layer for ease of use and re-use. + As they're frequently used for models to be deployed to edge devices, + they're implemented as a layer for ease of use and re-use. 
Args: input_filters: int, the number of input filters output_filters: int, the number of output filters - expand_ratio: default 1, the ratio by which input_filters are multiplied to expand - the structure in the middle expansion phase - kernel_size: default 3, the kernel_size to apply to the expansion phase convolutions - strides: default 1, the strides to apply to the expansion phase convolutions - se_ratio: default 0.0, The filters used in the Squeeze-Excitation phase, and are chosen as - the maximum between 1 and input_filters*se_ratio + expand_ratio: default 1, the ratio by which input_filters are multiplied + to expand the structure in the middle expansion phase + kernel_size: default 3, the kernel_size to apply to the expansion phase + convolutions + strides: default 1, the strides to apply to the expansion phase + convolutions + se_ratio: default 0.0, The filters used in the Squeeze-Excitation phase, + and are chosen as the maximum between 1 and input_filters*se_ratio bn_momentum: default 0.9, the BatchNormalization momentum - activation: default "swish", the activation function used between convolution operations - survival_probability: float, default 0.8, the optional dropout rate to apply before the output - convolution + activation: default "swish", the activation function used between + convolution operations + survival_probability: float, the optional dropout rate to apply before + the output convolution, defaults to 0.8 Returns: - A `tf.Tensor` representing a feature map, passed through the FusedMBConv block + A `tf.Tensor` representing a feature map, passed through the FusedMBConv + block Example usage: ``` inputs = tf.random.normal(shape=(1, 64, 64, 32), dtype=tf.float32) - layer = keras_cv.layers.FusedMBConvBlock(input_filters=32, output_filters=32) + layer = keras_cv.layers.FusedMBConvBlock( + input_filters=32, + output_filters=32 + ) output = layer(inputs) output.shape # TensorShape([1, 224, 224, 48]) ``` - """ + """ # noqa: E501 def __init__( self, diff --git a/keras_cv/layers/fusedmbconv_test.py b/keras_cv/layers/fusedmbconv_test.py index 2785fe2012..6b3d85a0da 100644 --- a/keras_cv/layers/fusedmbconv_test.py +++ b/keras_cv/layers/fusedmbconv_test.py @@ -25,7 +25,8 @@ def cleanup_global_session(self): # Code before yield runs before the test tf.config.set_soft_device_placement(False) yield - # Reset soft device placement to not interfere with other unit test files + # Reset soft device placement to not interfere with other unit test + # files tf.config.set_soft_device_placement(True) keras.backend.clear_session() diff --git a/keras_cv/layers/mbconv.py b/keras_cv/layers/mbconv.py index 7a15954a41..c4991a101b 100644 --- a/keras_cv/layers/mbconv.py +++ b/keras_cv/layers/mbconv.py @@ -45,36 +45,45 @@ def __init__( **kwargs ): """ - Implementation of the MBConv block (Mobile Inverted Residual Bottleneck) from: - (MobileNetV2: Inverted Residuals and Linear Bottlenecks)[https://arxiv.org/abs/1801.04381v4]. + Implementation of the MBConv block (Mobile Inverted Residual Bottleneck) + from: + [MobileNetV2: Inverted Residuals and Linear Bottlenecks](https://arxiv.org/abs/1801.04381v4). - MBConv blocks are common blocks used in mobile-oriented and efficient architectures, - present in architectures such as MobileNet, EfficientNet, MaxViT, etc. + MBConv blocks are common blocks used in mobile-oriented and efficient + architectures, present in architectures such as MobileNet, EfficientNet, + MaxViT, etc. 
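The filter bookkeeping described in the Args sections of these two blocks is easy to spell out with numbers. The values below are arbitrary; they just trace the expansion width and the `max(1, input_filters * se_ratio)` Squeeze-Excitation width quoted above.

```python
input_filters, expand_ratio, se_ratio = 32, 4, 0.25

expanded_filters = input_filters * expand_ratio     # 128, the "wide" middle of the block
se_filters = max(1, int(input_filters * se_ratio))  # 8, the Squeeze-Excitation width
print(expanded_filters, se_filters)
```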
- MBConv blocks follow a narrow-wide-narrow structure - expanding a 1x1 convolution, applying - depthwise convolution, and narrowing back to a 1x1 convolution, which is a more efficient operation - than conventional wide-narrow-wide structures. + MBConv blocks follow a narrow-wide-narrow structure - expanding a 1x1 + convolution, applying depthwise convolution, and narrowing back to a 1x1 + convolution, which is a more efficient operation than conventional + wide-narrow-wide structures. - As they're frequently used for models to be deployed to edge devices, they're - implemented as a layer for ease of use and re-use. + As they're frequently used for models to be deployed to edge devices, + they're implemented as a layer for ease of use and re-use. Args: input_filters: int, the number of input filters - output_filters: int, the optional number of output filters after Squeeze-Excitation - expand_ratio: default 1, the ratio by which input_filters are multiplied to expand - the structure in the middle expansion phase - kernel_size: default 3, the kernel_size to apply to the expansion phase convolutions - strides: default 1, the strides to apply to the expansion phase convolutions - se_ratio: default 0.0, Squeeze-Excitation happens before depthwise convolution - and before output convolution only if the se_ratio is above 0. - The filters used in this phase are chosen as the maximum between 1 and input_filters*se_ratio + output_filters: int, the optional number of output filters after + Squeeze-Excitation + expand_ratio: default 1, the ratio by which input_filters are + multiplied to expand the structure in the middle expansion phase + kernel_size: default 3, the kernel_size to apply to the expansion + phase convolutions + strides: default 1, the strides to apply to the expansion phase + convolutions + se_ratio: default 0.0, Squeeze-Excitation happens before depthwise + convolution and before output convolution only if the se_ratio + is above 0. 
The filters used in this phase are chosen as the + maximum between 1 and input_filters*se_ratio bn_momentum: default 0.9, the BatchNormalization momentum - activation: default "swish", the activation function used between convolution operations - survival_probability: float, default 0.8, the optional dropout rate to apply before the output - convolution + activation: default "swish", the activation function used between + convolution operations + survival_probability: float, the optional dropout rate to apply + before the output convolution, defaults to 0.8 Returns: - A `tf.Tensor` representing a feature map, passed through the MBConv block + A `tf.Tensor` representing a feature map, passed through the MBConv + block Example usage: @@ -86,7 +95,8 @@ def __init__( output = layer(inputs) output.shape # TensorShape([1, 64, 64, 32]) ``` - """ + """ # noqa: E501 + super().__init__(**kwargs) self.input_filters = input_filters self.output_filters = output_filters diff --git a/keras_cv/layers/mbconv_test.py b/keras_cv/layers/mbconv_test.py index b4f24f84a7..f5999dcf74 100644 --- a/keras_cv/layers/mbconv_test.py +++ b/keras_cv/layers/mbconv_test.py @@ -25,7 +25,8 @@ def cleanup_global_session(self): # Code before yield runs before the test tf.config.set_soft_device_placement(False) yield - # Reset soft device placement to not interfere with other unit test files + # Reset soft device placement to not interfere with other unit test + # files tf.config.set_soft_device_placement(True) keras.backend.clear_session() diff --git a/keras_cv/layers/object_detection/anchor_generator.py b/keras_cv/layers/object_detection/anchor_generator.py index 3df8b8a127..afebe8b915 100644 --- a/keras_cv/layers/object_detection/anchor_generator.py +++ b/keras_cv/layers/object_detection/anchor_generator.py @@ -22,13 +22,13 @@ class AnchorGenerator(keras.layers.Layer): """AnchorGenerator generates anchors for multiple feature maps. - AnchorGenerator takes multiple scales and generates anchor boxes based on the anchor - sizes, scales, aspect ratios, and strides provided. To invoke AnchorGenerator, call - it on the image that needs anchor boxes. + AnchorGenerator takes multiple scales and generates anchor boxes based on + the anchor sizes, scales, aspect ratios, and strides provided. To invoke + AnchorGenerator, call it on the image that needs anchor boxes. - `sizes` and `strides` must match structurally - they are pairs. Scales and - aspect ratios can either be a list, that is then used for all of the sizes - (aka levels), or a dictionary from `{'level_{number}': [parameters at scale...]}`. + `sizes` and `strides` must match structurally - they are pairs. Scales and + aspect ratios can either be a list, that is then used for all the sizes (aka + levels), or a dictionary from `{'level_{number}': [parameters at scale...]}` Args: bounding_box_format: The format of bounding boxes to generate. Refer @@ -36,16 +36,18 @@ class AnchorGenerator(keras.layers.Layer): for more details on supported bounding box formats. sizes: A list of integers that represent the anchor sizes for each level, or a dictionary of integer lists with each key representing a level. - For each anchor size, anchor height will be `anchor_size / sqrt(aspect_ratio)`, - and anchor width will be `anchor_size * sqrt(aspect_ratio)`. This is repeated - for each scale and aspect ratio. + For each anchor size, anchor height will be + `anchor_size / sqrt(aspect_ratio)`, and anchor width will be + `anchor_size * sqrt(aspect_ratio)`. 
This is repeated for each scale and + aspect ratio. scales: A list of floats corresponding to multipliers that will be multiplied by each `anchor_size` to generate a level. - aspect_ratios: A list of floats representing the ratio of anchor width to height. + aspect_ratios: A list of floats representing the ratio of anchor width to + height. strides: iterable of ints that represent the anchor stride size between center of anchors at each scale. - clip_boxes: Whether or not to clip generated anchor boxes to the image size. - Defaults to `False`. + clip_boxes: whether to clip generated anchor boxes to the image + size, defaults to `False`. Usage: ```python @@ -69,10 +71,10 @@ class AnchorGenerator(keras.layers.Layer): ``` Input shape: an image with shape `[H, W, C]` - Output: a dictionary with integer keys corresponding to each level of the feature - pyramid. The size of the anchors at each level will be + Output: a dictionary with integer keys corresponding to each level of the + feature pyramid. The size of the anchors at each level will be `(H/strides[i] * W/strides[i] * len(scales) * len(aspect_ratios), 4)`. - """ + """ # noqa: E501 def __init__( self, @@ -118,7 +120,7 @@ def _format_sizes_and_strides(sizes, strides): if sorted(result_strides.keys()) != sorted(result_sizes.keys()): raise ValueError( "Expected sizes and strides to be either lists of" - "the same length, or dictionaries with the same keys. Received " + "the same length, or dictionaries with the same keys. Received " f"sizes={sizes}, strides={strides}" ) @@ -167,7 +169,7 @@ def __call__(self, image=None, image_shape=None): if image is not None: if image.shape.rank != 3: raise ValueError( - "Expected `image` to be a Tensor of rank 3. Got " + "Expected `image` to be a Tensor of rank 3. Got " f"image.shape.rank={image.shape.rank}" ) image_shape = tf.shape(image) @@ -187,7 +189,8 @@ def __call__(self, image=None, image_shape=None): # TODO(tanzheny): consider having customized anchor offset. class _SingleAnchorGenerator: - """Internal utility to generate anchors for a single feature map in `yxyx` format. + """Internal utility to generate anchors for a single feature map in `yxyx` + format. Example: ```python @@ -210,8 +213,8 @@ class _SingleAnchorGenerator: stride: A single int represents the anchor stride size between center of each anchor. clip_boxes: Boolean to represent whether the anchor coordinates should be - clipped to the image size. Defaults to `False`. - dtype: (Optional) The data type to use for the output anchors. Defaults to + clipped to the image size, defaults to `False`. + dtype: (Optional) The data type to use for the output anchors, defaults to 'float32'. """ @@ -256,12 +259,12 @@ def __call__(self, image_size): half_anchor_widths = tf.reshape(0.5 * anchor_widths, [1, 1, -1]) stride = tf.cast(self.stride, tf.float32) - # make sure range of `cx` is within limit of `image_width` with `stride`, - # also for sizes where `image_width % stride != 0`. + # make sure range of `cx` is within limit of `image_width` with + # `stride`, also for sizes where `image_width % stride != 0`. # [W] cx = tf.range(0.5 * stride, (image_width // stride) * stride, stride) - # make sure range of `cy` is within limit of `image_height` with `stride`, - # also for sizes where `image_height % stride != 0`. + # make sure range of `cy` is within limit of `image_height` with + # `stride`, also for sizes where `image_height % stride != 0`. 
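The output size formula quoted in the `AnchorGenerator` docstring above can be sanity-checked with plain arithmetic; the image size, stride, scales and aspect ratios below are illustrative.

```python
image_h = image_w = 512
stride = 8
scales = [1.0, 2 ** (1 / 3), 2 ** (2 / 3)]
aspect_ratios = [0.5, 1.0, 2.0]

anchors_per_location = len(scales) * len(aspect_ratios)  # 9
num_anchors = (image_h // stride) * (image_w // stride) * anchors_per_location
print(num_anchors)  # 36864, so that level's anchor tensor has shape (36864, 4)
```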
# [H] cy = tf.range(0.5 * stride, (image_height // stride) * stride, stride) # [H, W] diff --git a/keras_cv/layers/object_detection/box_matcher.py b/keras_cv/layers/object_detection/box_matcher.py index 83a5a87418..4af3f18405 100644 --- a/keras_cv/layers/object_detection/box_matcher.py +++ b/keras_cv/layers/object_detection/box_matcher.py @@ -29,23 +29,25 @@ class BoxMatcher(keras.layers.Layer): The settings include `thresholds` and `match_values`, for example if: 1) thresholds=[negative_threshold, positive_threshold], and - match_values=[negative_value=0, ignore_value=-1, positive_value=1]: the rows will - be assigned to positive_value if its argmax result >= + match_values=[negative_value=0, ignore_value=-1, positive_value=1]: the + rows will be assigned to positive_value if its argmax result >= positive_threshold; the rows will be assigned to negative_value if its - argmax result < negative_threshold, and the rows will be assigned - to ignore_value if its argmax result is between [negative_threshold, positive_threshold). + argmax result < negative_threshold, and the rows will be assigned to + ignore_value if its argmax result is between [negative_threshold, + positive_threshold). 2) thresholds=[negative_threshold, positive_threshold], and - match_values=[ignore_value=-1, negative_value=0, positive_value=1]: the rows will - be assigned to positive_value if its argmax result >= + match_values=[ignore_value=-1, negative_value=0, positive_value=1]: the + rows will be assigned to positive_value if its argmax result >= positive_threshold; the rows will be assigned to ignore_value if its - argmax result < negative_threshold, and the rows will be assigned - to negative_value if its argmax result is between [negative_threshold ,positive_threshold). - This is different from case 1) by swapping first two + argmax result < negative_threshold, and the rows will be assigned to + negative_value if its argmax result is between [negative_threshold, + positive_threshold). This is different from case 1) by swapping first two values. 3) thresholds=[positive_threshold], and - match_values=[negative_values, positive_value]: the rows will be assigned to - positive value if its argmax result >= positive_threshold; the rows - will be assigned to negative_value if its argmax result < negative_threshold. + match_values=[negative_values, positive_value]: the rows will be assigned + to positive value if its argmax result >= positive_threshold; the rows + will be assigned to negative_value if its argmax result < + negative_threshold. Args: thresholds: A sorted list of floats to classify the matches into @@ -125,14 +127,16 @@ def call(self, similarity_matrix: tf.Tensor) -> Tuple[tf.Tensor, tf.Tensor]: def _match_when_cols_are_empty(): """Performs matching when the rows of similarity matrix are empty. - When the rows are empty, all detections are false positives. So we return - a tensor of -1's to indicate that the rows do not match to any columns. + When the rows are empty, all detections are false positives. So we + return a tensor of -1's to indicate that the rows do not match to + any columns. Returns: - matched_columns: An integer tensor of shape [batch_size, num_rows] - storing the index of the matched column for each row. - matched_values: An integer tensor of shape [batch_size, num_rows] - storing the match type indicator (e.g. positive or negative - or ignored match). + matched_columns: An integer tensor of shape [batch_size, + num_rows] storing the index of the matched column for each + row. 
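Case 1) of the `BoxMatcher` settings above can be traced by hand. The sketch below is not the layer itself, just the bucketing idea: each row's best similarity is compared against the sorted `thresholds` and mapped to the matching entry of `match_values` (`side="right"` so a value equal to the positive threshold counts as positive).

```python
import tensorflow as tf

thresholds = [0.3, 0.7]    # [negative_threshold, positive_threshold]
match_values = [0, -1, 1]  # [negative_value, ignore_value, positive_value]

row_maxima = tf.constant([0.1, 0.5, 0.9])  # best similarity per row (anchor)
bucket = tf.searchsorted(tf.constant(thresholds), row_maxima, side="right")
assigned = tf.gather(tf.constant(match_values), bucket)
print(assigned.numpy())  # [ 0 -1  1] -> negative, ignored, positive
```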
+ matched_values: An integer tensor of shape [batch_size, + num_rows] storing the match type indicator (e.g. positive or + negative or ignored match). """ with tf.name_scope("empty_boxes"): matched_columns = tf.zeros( @@ -144,20 +148,23 @@ def _match_when_cols_are_empty(): return matched_columns, matched_values def _match_when_cols_are_non_empty(): - """Performs matching when the rows of similarity matrix are non empty. + """Performs matching when the rows of similarity matrix are + non-empty. Returns: - matched_columns: An integer tensor of shape [batch_size, num_rows] - storing the index of the matched column for each row. - matched_values: An integer tensor of shape [batch_size, num_rows] - storing the match type indicator (e.g. positive or negative - or ignored match). + matched_columns: An integer tensor of shape [batch_size, + num_rows] storing the index of the matched column for each + row. + matched_values: An integer tensor of shape [batch_size, + num_rows] storing the match type indicator (e.g. positive or + negative or ignored match). """ with tf.name_scope("non_empty_boxes"): matched_columns = tf.argmax( similarity_matrix, axis=-1, output_type=tf.int32 ) - # Get logical indices of ignored and unmatched columns as tf.int64 + # Get logical indices of ignored and unmatched columns as + # tf.int64 matched_vals = tf.reduce_max(similarity_matrix, axis=-1) matched_values = tf.zeros([batch_size, num_rows], tf.int32) @@ -176,18 +183,19 @@ def _match_when_cols_are_non_empty(): ) if self.force_match_for_each_col: - # [batch_size, num_cols], for each column (groundtruth_box), find the - # best matching row (anchor). + # [batch_size, num_cols], for each column (groundtruth_box), + # find the best matching row (anchor). matching_rows = tf.argmax( input=similarity_matrix, axis=1, output_type=tf.int32 ) - # [batch_size, num_cols, num_rows], a transposed 0-1 mapping matrix M, - # where M[j, i] = 1 means column j is matched to row i. + # [batch_size, num_cols, num_rows], a transposed 0-1 mapping + # matrix M, where M[j, i] = 1 means column j is matched to + # row i. column_to_row_match_mapping = tf.one_hot( matching_rows, depth=num_rows ) - # [batch_size, num_rows], for each row (anchor), find the matched - # column (groundtruth_box). + # [batch_size, num_rows], for each row (anchor), find the + # matched column (groundtruth_box). force_matched_columns = tf.argmax( input=column_to_row_match_mapping, axis=1, diff --git a/keras_cv/layers/object_detection/box_matcher_test.py b/keras_cv/layers/object_detection/box_matcher_test.py index 021f99a8ef..99cc6c3908 100644 --- a/keras_cv/layers/object_detection/box_matcher_test.py +++ b/keras_cv/layers/object_detection/box_matcher_test.py @@ -107,8 +107,8 @@ def test_box_matcher_force_match(self): self.assertAllEqual( negative_matches.numpy(), [False, False, False, False] ) - # the first anchor cannot be matched to 4th gt box given that is matched to - # the last anchor. + # the first anchor cannot be matched to 4th gt box given that is matched + # to the last anchor. 
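For illustration, a minimal sketch of the threshold/match-value semantics described in the `BoxMatcher` docstring above (case 1). The import path is assumed from the file location shown in this diff, and the similarity values are made up:

```python
import tensorflow as tf

# Import path assumed from the file location in this diff.
from keras_cv.layers.object_detection.box_matcher import BoxMatcher

# Rows with argmax similarity >= 0.5 become positive (1), rows below 0.4
# become negative (0), and anything in between is ignored (-1).
matcher = BoxMatcher(thresholds=[0.4, 0.5], match_values=[0, -1, 1])

# 3 rows (e.g. anchors) scored against 2 columns (e.g. ground truth boxes).
similarity = tf.constant(
    [[0.70, 0.10],
     [0.45, 0.20],
     [0.05, 0.30]]
)

matched_columns, matched_values = matcher(similarity)
# expected: matched_columns -> [0, 0, 1]  (argmax column per row)
# expected: matched_values  -> [1, -1, 0] (positive, ignored, negative)
```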
self.assertAllEqual(match_indices.numpy(), [1, 2, 0, 3]) self.assertAllEqual(matched_values.numpy(), [1, 1, 1, 1]) diff --git a/keras_cv/layers/object_detection/multi_class_non_max_suppression.py b/keras_cv/layers/object_detection/multi_class_non_max_suppression.py index f272c1068f..2273a24805 100644 --- a/keras_cv/layers/object_detection/multi_class_non_max_suppression.py +++ b/keras_cv/layers/object_detection/multi_class_non_max_suppression.py @@ -25,19 +25,20 @@ class MultiClassNonMaxSuppression(keras.layers.Layer): Arguments: bounding_box_format: The format of bounding boxes of input dataset. Refer - [to the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/) - for more details on supported bounding box formats. - from_logits: boolean, True means input score is logits, False means confidence. + [to the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/) for more details on supported bounding box + formats. + from_logits: boolean, True means input score is logits, False means + confidence. iou_threshold: a float value in the range [0, 1] representing the minimum - IoU threshold for two boxes to be considered same for suppression. Defaults - to 0.5. + IoU threshold for two boxes to be considered same for suppression. + Defaults to 0.5. confidence_threshold: a float value in the range [0, 1]. All boxes with - confidence below this value will be discarded. Defaults to 0.9. - max_detections: the maximum detections to consider after nms is applied. A large - number may trigger significant memory overhead. Defaults to 100. - max_detections_per_class: the maximum detections to consider per class after - nms is applied. Defaults to 100. - """ + confidence below this value will be discarded, defaults to 0.9. + max_detections: the maximum detections to consider after nms is applied. A + large number may trigger significant memory overhead, defaults to 100. + max_detections_per_class: the maximum detections to consider per class + after nms is applied, defaults to 100. + """ # noqa: E501 def __init__( self, @@ -59,7 +60,8 @@ def __init__( self.built = True def call(self, box_prediction, class_prediction): - """Accepts images and raw predictions, and returns bounding box predictions. + """Accepts images and raw predictions, and returns bounding box + predictions. Args: box_prediction: Dense Tensor of shape [batch, boxes, 4] in the diff --git a/keras_cv/layers/object_detection/roi_align.py b/keras_cv/layers/object_detection/roi_align.py index 686d729402..a1ccc09609 100644 --- a/keras_cv/layers/object_detection/roi_align.py +++ b/keras_cv/layers/object_detection/roi_align.py @@ -39,8 +39,8 @@ def _feature_bilinear_interpolation( kernel_x = [hx, lx] Args: - features: The features are in shape of [batch_size, num_boxes, output_size * - 2, output_size * 2, num_filters]. + features: The features are in shape of [batch_size, num_boxes, + output_size * 2, output_size * 2, num_filters]. kernel_y: Tensor of size [batch_size, boxes, output_size, 2, 1]. kernel_x: Tensor of size [batch_size, boxes, output_size, 2, 1]. @@ -92,13 +92,14 @@ def _compute_grid_positions( information of each box w.r.t. the corresponding feature map. boxes[:, :, 0:2] are the grid position in (y, x) (float) of the top-left corner of each box. boxes[:, :, 2:4] are the box sizes in (h, w) (float) - in terms of the number of pixels of the corresponding feature map size. + in terms of the number of pixels of the corresponding feature map + size. 
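A minimal construction sketch for the NMS layer documented above, using the argument names from its docstring. The import path is assumed from the file location in this diff, and the boxes and scores are made-up values:

```python
import tensorflow as tf

# Import path assumed from the file location in this diff.
from keras_cv.layers.object_detection.multi_class_non_max_suppression import (
    MultiClassNonMaxSuppression,
)

nms = MultiClassNonMaxSuppression(
    bounding_box_format="xyxy",
    from_logits=False,        # scores below are already confidences in [0, 1]
    iou_threshold=0.5,
    confidence_threshold=0.6,
    max_detections=10,
)

# One image, two boxes: the second heavily overlaps the first but scores
# lower, so it should be suppressed. Shapes follow the call() docstring.
box_prediction = tf.constant(
    [[[0.0, 0.0, 10.0, 10.0],
      [1.0, 1.0, 10.0, 10.0]]]
)
class_prediction = tf.constant([[[0.9], [0.7]]])  # assumed [batch, boxes, num_classes]

predictions = nms(box_prediction, class_prediction)
```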
boundaries: a 3-D tensor of shape [batch_size, num_boxes, 2] representing the boundary (in (y, x)) of the corresponding feature map for each box. - Any resampled grid points that go beyond the bounary will be clipped. + Any resampled grid points that go beyond the boundary will be clipped. output_size: a scalar indicating the output crop size. - sample_offset: a float number in [0, 1] indicates the subpixel sample offset - from grid point. + sample_offset: a float number in [0, 1] indicates the subpixel sample + offset from grid point. Returns: kernel_y: Tensor of size [batch_size, boxes, output_size, 2, 1]. @@ -173,16 +174,17 @@ def multilevel_crop_and_resize( Generate the (output_size, output_size) set of pixels for each input box by first locating the box into the correct feature level, and then cropping - and resizing it using the correspoding feature map of that level. + and resizing it using the corresponding feature map of that level. Args: - features: A dictionary with key as pyramid level and value as features. The - features are in shape of [batch_size, height_l, width_l, num_filters]. - boxes: A 3-D Tensor of shape [batch_size, num_boxes, 4]. Each row represents - a box with [y1, x1, y2, x2] in un-normalized coordinates. + features: A dictionary with key as pyramid level and value as features. + The features are in shape of [batch_size, height_l, width_l, + num_filters]. + boxes: A 3-D Tensor of shape [batch_size, num_boxes, 4]. Each row + represents a box with [y1, x1, y2, x2] in un-normalized coordinates. output_size: A scalar to indicate the output crop size. - sample_offset: a float number in [0, 1] indicates the subpixel sample offset - from grid point. + sample_offset: a float number in [0, 1] indicates the subpixel sample + offset from grid point. Returns: A 5-D tensor representing feature crop of shape @@ -212,8 +214,8 @@ def multilevel_crop_and_resize( shape = features[level].get_shape().as_list() feature_heights.append(shape[1]) feature_widths.append(shape[2]) - # Concat tensor of [batch_size, height_l * width_l, num_filters] for each - # levels. + # Concat tensor of [batch_size, height_l * width_l, num_filters] for + # each level. features_all.append( tf.reshape(features[level], [batch_size, -1, num_filters]) ) @@ -343,8 +345,8 @@ def multilevel_crop_and_resize( [-1], ) - # TODO(tanzhenyu): replace tf.gather with tf.gather_nd and try to get similar - # performance. + # TODO(tanzhenyu): replace tf.gather with tf.gather_nd and try to get + # similar performance. features_per_box = tf.reshape( tf.gather(features_r2, indices), [ @@ -363,9 +365,9 @@ def multilevel_crop_and_resize( return features_per_box -# TODO(tanzhenyu): Remove this implementation once roi_pool has better performance. -# as this is mostly a duplicate of -# https://github.com/tensorflow/models/blob/master/official/legacy/detection/ops/spatial_transform_ops.py#L324 +# TODO(tanzhenyu): Remove this implementation once roi_pool has better +# performance as this is mostly a duplicate of +# https://github.com/tensorflow/models/blob/master/official/legacy/detection/ops/spatial_transform_ops.py#L324 @keras.utils.register_keras_serializable(package="keras_cv") class _ROIAligner(keras.layers.Layer): """Performs ROIAlign for the second stage processing.""" @@ -402,8 +404,8 @@ def call( """ Args: - features: A dictionary with key as pyramid level and value as features. - The features are in shape of + features: A dictionary with key as pyramid level and value as + features. 
The features are in shape of [batch_size, height_l, width_l, num_filters]. boxes: A 3-D `tf.Tensor` of shape [batch_size, num_boxes, 4]. Each row represents a box with [y1, x1, y2, x2] in un-normalized coordinates. diff --git a/keras_cv/layers/object_detection/roi_generator.py b/keras_cv/layers/object_detection/roi_generator.py index 94c1af33e0..b6555e58d8 100644 --- a/keras_cv/layers/object_detection/roi_generator.py +++ b/keras_cv/layers/object_detection/roi_generator.py @@ -47,21 +47,29 @@ class ROIGenerator(keras.layers.Layer): bounding_box_format: a case-insensitive string. For detailed information on the supported format, see the [KerasCV bounding box documentation](https://keras.io/api/keras_cv/bounding_box/formats/). - pre_nms_topk_train: int. number of top k scoring proposals to keep before applying NMS in training mode. - When RPN is run on multiple feature maps / levels (as in FPN) this number is per + pre_nms_topk_train: int. number of top k scoring proposals to keep + before applying NMS in training mode. When RPN is run on multiple + feature maps / levels (as in FPN) this number is per feature map / level. - nms_score_threshold_train: float. score threshold to use for NMS in training mode. - nms_iou_threshold_train: float. IOU threshold to use for NMS in training mode. - post_nms_topk_train: int. number of top k scoring proposals to keep after applying NMS in training mode. - When RPN is run on multiple feature maps / levels (as in FPN) this number is per + nms_score_threshold_train: float. score threshold to use for NMS in + training mode. + nms_iou_threshold_train: float. IOU threshold to use for NMS in training + mode. + post_nms_topk_train: int. number of top k scoring proposals to keep + after applying NMS in training mode. When RPN is run on multiple + feature maps / levels (as in FPN) this number is per feature map / level. - pre_nms_topk_test: int. number of top k scoring proposals to keep before applying NMS in inference mode. - When RPN is run on multiple feature maps / levels (as in FPN) this number is per + pre_nms_topk_test: int. number of top k scoring proposals to keep before + applying NMS in inference mode. When RPN is run on multiple + feature maps / levels (as in FPN) this number is per feature map / level. - nms_score_threshold_test: float. score threshold to use for NMS in inference mode. - nms_iou_threshold_test: float. IOU threshold to use for NMS in inference mode. - post_nms_topk_test: int. number of top k scoring proposals to keep after applying NMS in inference mode. - When RPN is run on multiple feature maps / levels (as in FPN) this number is per + nms_score_threshold_test: float. score threshold to use for NMS in + inference mode. + nms_iou_threshold_test: float. IOU threshold to use for NMS in inference + mode. + post_nms_topk_test: int. number of top k scoring proposals to keep after + applying NMS in inference mode. When RPN is run on multiple + feature maps / levels (as in FPN) this number is per feature map / level. Usage: @@ -72,7 +80,7 @@ class ROIGenerator(keras.layers.Layer): rois, roi_scores = roi_generator(boxes, scores, training=True) ``` - """ + """ # noqa: E501 def __init__( self, @@ -107,12 +115,14 @@ def call( ) -> Tuple[tf.Tensor, tf.Tensor]: """ Args: - multi_level_boxes: float Tensor. A dictionary or single Tensor of boxes, one per level. shape is - [batch_size, num_boxes, 4] each level, in `bounding_box_format`. 
- The boxes from RPNs are usually encoded as deltas w.r.t to anchors, - they need to be decoded before passing in here. - multi_level_scores: float Tensor. A dictionary or single Tensor of scores, usually confidence scores, - one per level. shape is [batch_size, num_boxes] each level. + multi_level_boxes: float Tensor. A dictionary or single Tensor of + boxes, one per level. Shape is [batch_size, num_boxes, 4] each + level, in `bounding_box_format`. The boxes from RPNs are usually + encoded as deltas w.r.t to anchors, they need to be decoded before + passing in here. + multi_level_scores: float Tensor. A dictionary or single Tensor of + scores, typically confidence scores, one per level. Shape is + [batch_size, num_boxes] each level. Returns: rois: float Tensor of [batch_size, post_nms_topk, 4] diff --git a/keras_cv/layers/object_detection/roi_generator_test.py b/keras_cv/layers/object_detection/roi_generator_test.py index 0ba74d6596..f6442a6076 100644 --- a/keras_cv/layers/object_detection/roi_generator_test.py +++ b/keras_cv/layers/object_detection/roi_generator_test.py @@ -132,8 +132,10 @@ def test_single_level_propose_rois(self): expected_rois = tf.concat([expected_rois, tf.zeros([2, 1, 4])], axis=1) rpn_boxes = {2: rpn_boxes} rpn_scores = tf.constant([[0.6, 0.9, 0.2, 0.3], [0.1, 0.8, 0.3, 0.5]]) - # 1st batch -- selecting the 1st, then 3rd, then 2nd as they don't overlap - # 2nd batch -- selecting the 1st, then 3rd, then 0th as they don't overlap + # 1st batch -- selecting the 1st, then 3rd, then 2nd as they don't + # overlap + # 2nd batch -- selecting the 1st, then 3rd, then 0th as they don't + # overlap expected_roi_scores = tf.gather( rpn_scores, [[1, 3, 2], [1, 3, 0]], batch_dims=1 ) @@ -179,8 +181,10 @@ def test_two_level_single_batch_propose_rois_ignore_box(self): ) rpn_boxes = {2: rpn_boxes[0:1], 3: rpn_boxes[1:2]} rpn_scores = tf.constant([[0.6, 0.9, 0.2, 0.3], [0.1, 0.8, 0.3, 0.5]]) - # 1st batch -- selecting the 1st, then 3rd, then 2nd as they don't overlap - # 2nd batch -- selecting the 1st, then 3rd, then 0th as they don't overlap + # 1st batch -- selecting the 1st, then 3rd, then 2nd as they don't + # overlap + # 2nd batch -- selecting the 1st, then 3rd, then 0th as they don't + # overlap expected_roi_scores = [ [ 0.9, @@ -232,8 +236,10 @@ def test_two_level_single_batch_propose_rois_all_box(self): ) rpn_boxes = {2: rpn_boxes[0:1], 3: rpn_boxes[1:2]} rpn_scores = tf.constant([[0.6, 0.9, 0.2, 0.3], [0.1, 0.8, 0.3, 0.5]]) - # 1st batch -- selecting the 1st, then 0th, then 3rd, then 2nd as they don't overlap - # 2nd batch -- selecting the 1st, then 3rd, then 2nd, then 0th as they don't overlap + # 1st batch -- selecting the 1st, then 0th, then 3rd, then 2nd as they + # don't overlap + # 2nd batch -- selecting the 1st, then 3rd, then 2nd, then 0th as they + # don't overlap expected_roi_scores = [ [ 0.9, diff --git a/keras_cv/layers/object_detection/roi_pool.py b/keras_cv/layers/object_detection/roi_pool.py index b44e37af7e..449f77927a 100644 --- a/keras_cv/layers/object_detection/roi_pool.py +++ b/keras_cv/layers/object_detection/roi_pool.py @@ -21,20 +21,23 @@ @keras.utils.register_keras_serializable(package="keras_cv") class ROIPooler(keras.layers.Layer): """ - Pooling feature map of dynamic shape into region of interest (ROI) of fixed shape. + Pooling feature map of dynamic shape into region of interest (ROI) of fixed + shape. Mainly used in Region CNN (RCNN) networks. This works for a single-level input feature map. 
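To make the train/inference split of the NMS arguments above concrete, a hedged sketch follows (import path assumed from the file location in this diff; the top-k values and boxes are illustrative only):

```python
import tensorflow as tf

# Import path assumed from the file location in this diff.
from keras_cv.layers.object_detection.roi_generator import ROIGenerator

# Separate proposal budgets for training and inference, per the args above.
roi_generator = ROIGenerator(
    bounding_box_format="xyxy",
    pre_nms_topk_train=2000,
    post_nms_topk_train=1000,
    pre_nms_topk_test=1000,
    post_nms_topk_test=500,
)

# A single feature level (key 2) with 4 decoded proposals for a batch of 1.
boxes = {2: tf.constant([[[0.0, 0.0, 10.0, 10.0],
                          [1.0, 1.0, 11.0, 11.0],
                          [50.0, 50.0, 60.0, 60.0],
                          [2.0, 2.0, 12.0, 12.0]]])}
scores = {2: tf.constant([[0.9, 0.8, 0.7, 0.6]])}

train_rois, train_scores = roi_generator(boxes, scores, training=True)
eval_rois, eval_scores = roi_generator(boxes, scores, training=False)
```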
- This layer splits the feature map into [target_size[0], target_size[1]] areas, - and performs max pooling for each area. The area coordinates will be quantized. + This layer splits the feature map into [target_size[0], target_size[1]] + areas, and performs max pooling for each area. The area coordinates will be + quantized. Args: bounding_box_format: a case-insensitive string. For detailed information on the supported format, see the [KerasCV bounding box documentation](https://keras.io/api/keras_cv/bounding_box/formats/). target_size: List or Tuple of 2 integers of the pooled shape - image_shape: List of Tuple of 3 integers, or `TensorShape` of the input image shape. + image_shape: List of Tuple of 3 integers, or `TensorShape` of the input + image shape. Usage: ```python @@ -44,7 +47,7 @@ class ROIPooler(keras.layers.Layer): rois = tf.constant([[[15., 30., 25., 45.]], [[22., 1., 30., 32.]]]) pooled_feature_map = roi_pooler(feature_map, rois) ``` - """ + """ # noqa: E501 def __init__( self, @@ -56,7 +59,8 @@ def __init__( ): if not isinstance(target_size, (tuple, list)): raise ValueError( - f"Expected `target_size` to be tuple or list, got {type(target_size)}" + "Expected `target_size` to be tuple or list, got " + f"{type(target_size)}" ) if len(target_size) != 2: raise ValueError( @@ -80,8 +84,10 @@ def __init__( def call(self, feature_map, rois): """ Args: - feature_map: [batch_size, H, W, C] float Tensor, the feature map extracted from image. - rois: [batch_size, N, 4] float Tensor, the region of interests to be pooled. + feature_map: [batch_size, H, W, C] float Tensor, the feature map + extracted from image. + rois: [batch_size, N, 4] float Tensor, the region of interests to be + pooled. Returns: pooled_feature_map: [batch_size, N, target_size, C] float Tensor """ diff --git a/keras_cv/layers/object_detection/roi_pool_test.py b/keras_cv/layers/object_detection/roi_pool_test.py index 90d6011935..052aae7f14 100644 --- a/keras_cv/layers/object_detection/roi_pool_test.py +++ b/keras_cv/layers/object_detection/roi_pool_test.py @@ -27,7 +27,8 @@ def test_no_quantize(self): ) rois = tf.reshape(tf.constant([0.0, 0.0, 1.0, 1.0]), [1, 1, 4]) pooled_feature_map = roi_pooler(feature_map, rois) - # the maximum value would be at bottom-right at each block, roi sharded into 2x2 blocks + # the maximum value would be at bottom-right at each block, roi sharded + # into 2x2 blocks # | 0, 1, 2, 3 | 4, 5, 6, 7 | # | 8, 9, 10, 11 | 12, 13, 14, 15 | # | 16, 17, 18, 19 | 20, 21, 22, 23 | @@ -52,7 +53,8 @@ def test_roi_quantize_y(self): ) rois = tf.reshape(tf.constant([0.0, 0.0, 224, 220]), [1, 1, 4]) pooled_feature_map = roi_pooler(feature_map, rois) - # the maximum value would be at bottom-right at each block, roi sharded into 2x2 blocks + # the maximum value would be at bottom-right at each block, roi sharded + # into 2x2 blocks # | 0, 1, 2 | 3, 4, 5, 6 | 7 (removed) # | 8, 9, 10 | 11, 12, 13, 14 | 15 (removed) # | 16, 17, 18 | 19, 20, 21, 22 | 23 (removed) @@ -77,7 +79,8 @@ def test_roi_quantize_x(self): ) rois = tf.reshape(tf.constant([0.0, 0.0, 220, 224]), [1, 1, 4]) pooled_feature_map = roi_pooler(feature_map, rois) - # the maximum value would be at bottom-right at each block, roi sharded into 2x2 blocks + # the maximum value would be at bottom-right at each block, roi sharded + # into 2x2 blocks # | 0, 1, 2, 3 | 4, 5, 6, 7 | # | 8, 9, 10, 11 | 12, 13, 14, 15 | # | 16, 17, 18, 19(max) | 20, 21, 22, 23(max) | @@ -101,7 +104,8 @@ def test_roi_quantize_h(self): ) rois = tf.reshape(tf.constant([0.0, 0.0, 
224, 224]), [1, 1, 4]) pooled_feature_map = roi_pooler(feature_map, rois) - # the maximum value would be at bottom-right at each block, roi sharded into 3x2 blocks + # the maximum value would be at bottom-right at each block, roi sharded + # into 3x2 blocks # | 0, 1, 2, 3 | 4, 5, 6, 7 | # | 8, 9, 10, 11(max) | 12, 13, 14, 15(max) | # -------------------------------------------- @@ -127,7 +131,8 @@ def test_roi_quantize_w(self): ) rois = tf.reshape(tf.constant([0.0, 0.0, 224, 224]), [1, 1, 4]) pooled_feature_map = roi_pooler(feature_map, rois) - # the maximum value would be at bottom-right at each block, roi sharded into 2x3 blocks + # the maximum value would be at bottom-right at each block, roi sharded + # into 2x3 blocks # | 0, 1 | 2, 3, 4 | 5, 6, 7 | # | 8, 9 | 10, 11, 12 | 13, 14, 15 | # | 16, 17 | 18, 19, 20 | 21, 22, 23 | diff --git a/keras_cv/layers/object_detection/roi_sampler.py b/keras_cv/layers/object_detection/roi_sampler.py index b3c7ab489f..a7aa6279d0 100644 --- a/keras_cv/layers/object_detection/roi_sampler.py +++ b/keras_cv/layers/object_detection/roi_sampler.py @@ -25,7 +25,7 @@ @keras.utils.register_keras_serializable(package="keras_cv") class _ROISampler(keras.layers.Layer): """ - Sample ROIs for loss related calucation. + Sample ROIs for loss related calculation. With proposals (ROIs) and ground truth, it performs the following: 1) compute IOU similarity matrix @@ -34,28 +34,29 @@ class _ROISampler(keras.layers.Layer): `append_gt_boxes` augments proposals with ground truth boxes. This is useful in 2 stage detection networks during initialization where the - 1st stage often cannot produce good proposals for 2nd stage. Setting it - to True will allow it to generate more reasonable proposals at the begining. + 1st stage often cannot produce good proposals for 2nd stage. Setting it to + True will allow it to generate more reasonable proposals at the beginning. - `background_class` allow users to set the labels for background proposals. Default - is 0, where users need to manually shift the incoming `gt_classes` if its range is - [0, num_classes). + `background_class` allow users to set the labels for background proposals. + Default is 0, where users need to manually shift the incoming `gt_classes` + if its range is [0, num_classes). Args: bounding_box_format: The format of bounding boxes to generate. Refer [to the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/) for more details on supported bounding box formats. - roi_matcher: a `BoxMatcher` object that matches proposals - with ground truth boxes. the positive match must be 1 and negative match must be -1. + roi_matcher: a `BoxMatcher` object that matches proposals with ground + truth boxes. The positive match must be 1 and negative match must be -1. Such assumption is not being validated here. - positive_fraction: the positive ratio w.r.t `num_sampled_rois`. Defaults to 0.25. - background_class: the background class which is used to map returned the sampled - ground truth which is classified as background. + positive_fraction: the positive ratio w.r.t `num_sampled_rois`, defaults + to 0.25. + background_class: the background class which is used to map returned the + sampled ground truth which is classified as background. num_sampled_rois: the number of sampled proposals per image for - further (loss) calculation. Defaults to 256. + further (loss) calculation, defaults to 256. append_gt_boxes: boolean, whether gt_boxes will be appended to rois - before sample the rois. Defaults to True. 
- """ + before sample the rois, defaults to True. + """ # noqa: E501 def __init__( self, @@ -107,7 +108,8 @@ def call( ) if num_rois < self.num_sampled_rois: raise ValueError( - f"num_rois must be less than `num_sampled_rois` ({self.num_sampled_rois}), got {num_rois}" + "num_rois must be less than `num_sampled_rois` " + f"({self.num_sampled_rois}), got {num_rois}" ) rois = bounding_box.convert_format( rois, source=self.bounding_box_format, target="yxyx" diff --git a/keras_cv/layers/object_detection/roi_sampler_test.py b/keras_cv/layers/object_detection/roi_sampler_test.py index 43332fbc07..80f822b93e 100644 --- a/keras_cv/layers/object_detection/roi_sampler_test.py +++ b/keras_cv/layers/object_detection/roi_sampler_test.py @@ -92,7 +92,7 @@ def test_roi_sampler_small_threshold(self): sampled_rois, sampled_gt_boxes, _, sampled_gt_classes, _ = roi_sampler( rois, gt_boxes, gt_classes ) - # given we only choose 1 positive sample, and `append_labesl` is False, + # given we only choose 1 positive sample, and `append_label` is False, # only the 2nd ROI is chosen. No negative samples exist given we # select positive_threshold to be 0.1. (the minimum IOU is 1/7) # given num_sampled_rois=2, it selects the 1st ROI as well. @@ -119,7 +119,8 @@ def test_roi_sampler_small_threshold(self): self.assertAllClose(expected_gt_classes, sampled_gt_classes) def test_roi_sampler_large_threshold(self): - # the 2nd roi and 2nd gt box has IOU of 0.923, setting positive_threshold to 0.95 to ignore it + # the 2nd roi and 2nd gt box has IOU of 0.923, setting + # positive_threshold to 0.95 to ignore it. box_matcher = BoxMatcher(thresholds=[0.95], match_values=[-1, 1]) roi_sampler = _ROISampler( bounding_box_format="xyxy", @@ -157,7 +158,8 @@ def test_roi_sampler_large_threshold(self): self.assertAllClose(expected_gt_classes, sampled_gt_classes) def test_roi_sampler_large_threshold_custom_bg_class(self): - # the 2nd roi and 2nd gt box has IOU of 0.923, setting positive_threshold to 0.95 to ignore it + # the 2nd roi and 2nd gt box has IOU of 0.923, setting + # positive_threshold to 0.95 to ignore it. box_matcher = BoxMatcher(thresholds=[0.95], match_values=[-1, 1]) roi_sampler = _ROISampler( bounding_box_format="xyxy", @@ -188,7 +190,8 @@ def test_roi_sampler_large_threshold_custom_bg_class(self): ) # all ROIs are negative matches, so they are mapped to 0. expected_gt_boxes = tf.zeros([1, 2, 4], dtype=tf.float32) - # only the 2nd ROI is chosen, and the negative ROI is mapped to -1 from customization. + # only the 2nd ROI is chosen, and the negative ROI is mapped to -1 from + # customization. expected_gt_classes = tf.constant([[-1], [-1]], dtype=tf.int32) expected_gt_classes = expected_gt_classes[tf.newaxis, ...] # self.assertAllClose(expected_rois, sampled_rois) @@ -196,7 +199,8 @@ def test_roi_sampler_large_threshold_custom_bg_class(self): self.assertAllClose(expected_gt_classes, sampled_gt_classes) def test_roi_sampler_large_threshold_append_gt_boxes(self): - # the 2nd roi and 2nd gt box has IOU of 0.923, setting positive_threshold to 0.95 to ignore it + # the 2nd roi and 2nd gt box has IOU of 0.923, setting + # positive_threshold to 0.95 to ignore it. 
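A hedged sketch of how the sampler described above fits together with `BoxMatcher`, mirroring the pattern used in the tests in this diff. `_ROISampler` is an internal (underscore-prefixed) layer, and every numeric value below is made up:

```python
import tensorflow as tf

# Import paths assumed from the file locations in this diff.
from keras_cv.layers.object_detection.box_matcher import BoxMatcher
from keras_cv.layers.object_detection.roi_sampler import _ROISampler

# Proposals with IoU >= 0.5 against a ground truth box count as positive (1),
# everything else as negative (-1), matching the docstring's assumption.
box_matcher = BoxMatcher(thresholds=[0.5], match_values=[-1, 1])
roi_sampler = _ROISampler(
    bounding_box_format="xyxy",
    roi_matcher=box_matcher,
    positive_fraction=0.5,
    num_sampled_rois=2,
    append_gt_boxes=False,
)

# Batch of 1 image: 4 proposals and 2 padded ground truth boxes/classes.
rois = tf.constant([[[0.0, 0.0, 10.0, 10.0],
                     [0.0, 0.0, 9.0, 9.0],
                     [50.0, 50.0, 60.0, 60.0],
                     [20.0, 20.0, 30.0, 30.0]]])
gt_boxes = tf.constant([[[0.0, 0.0, 10.0, 10.0],
                         [50.0, 50.0, 61.0, 61.0]]])
gt_classes = tf.constant([[[2], [3]]], dtype=tf.int32)

sampled_rois, sampled_gt_boxes, _, sampled_gt_classes, _ = roi_sampler(
    rois, gt_boxes, gt_classes
)
```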
box_matcher = BoxMatcher(thresholds=[0.95], match_values=[-1, 1]) roi_sampler = _ROISampler( bounding_box_format="xyxy", diff --git a/keras_cv/layers/object_detection/rpn_label_encoder.py b/keras_cv/layers/object_detection/rpn_label_encoder.py index 23addae3d7..eec6b33db2 100644 --- a/keras_cv/layers/object_detection/rpn_label_encoder.py +++ b/keras_cv/layers/object_detection/rpn_label_encoder.py @@ -26,13 +26,16 @@ @keras.utils.register_keras_serializable(package="keras_cv") class _RpnLabelEncoder(keras.layers.Layer): - """Transforms the raw labels into training targets for region proposal network (RPN). + """Transforms the raw labels into training targets for region proposal + network (RPN). # TODO(tanzhenyu): consider unifying with _ROISampler. This is different from _ROISampler for a couple of reasons: - 1) This deals with unbatched input, dict of anchors and potentially ragged labels - 2) This deals with ground truth boxes, while _ROISampler deals with padded ground truth - boxes with value -1 and padded ground truth classes with value -1 + 1) This deals with unbatched input, dict of anchors and potentially ragged + labels. + 2) This deals with ground truth boxes, while _ROISampler deals with padded + ground truth boxes with value -1 and padded ground truth classes with + value -1. 3) this returns positive class target as 1, while _ROISampler returns positive class target as-is. (All negative class target are 0) The final classification loss will use one hot and #num_fg_classes + 1 @@ -44,18 +47,19 @@ class _RpnLabelEncoder(keras.layers.Layer): Args: anchor_format: The format of bounding boxes for anchors to generate. Refer - [to the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/) - for more details on supported bounding box formats. - ground_truth_box_format: The format of bounding boxes for ground truth boxes to generate. - positive_threshold: the float threshold to set an anchor to positive match to gt box. - values above it are positive matches. - negative_threshold: the float threshold to set an anchor to negative match to gt box. - values below it are negative matches. - samples_per_image: for each image, the number of positive and negative samples - to generate. + [to the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/) for more details on supported bounding box + formats. + ground_truth_box_format: The format of bounding boxes for ground truth + boxes to generate. + positive_threshold: the float threshold to set an anchor to positive match + to gt box. Values above it are positive matches. + negative_threshold: the float threshold to set an anchor to negative match + to gt box. Values below it are negative matches. + samples_per_image: for each image, the number of positive and negative + samples to generate. positive_fraction: the fraction of positive samples to the total samples. - """ + """ # noqa: E501 def __init__( self, @@ -92,7 +96,7 @@ def call( ): """ Args: - anchors: dict of [num_anchors, 4] or [batch_size, num_anchors, 4] + anchors_dict: dict of [num_anchors, 4] or [batch_size, num_anchors, 4] float Tensor for each level. gt_boxes: [num_gt, 4] or [batch_size, num_anchors] float Tensor. gt_classes: [num_gt, 1] float or integer Tensor. @@ -121,17 +125,19 @@ def call( matched_gt_indices, matched_vals = self.box_matcher(similarity_mat) # [num_anchors] or [batch_size, num_anchors] positive_matches = tf.math.equal(matched_vals, 1) - # currently SyncOnReadVariable does not support `assign_add` in cross-replica. 
-        # self._positives.update_state(
-        # tf.reduce_sum(tf.cast(positive_matches, tf.float32), axis=-1)
-        # )
+        # currently SyncOnReadVariable does not support `assign_add` in
+        # cross-replica.
+        # self._positives.update_state(
+        # tf.reduce_sum(tf.cast(positive_matches, tf.float32), axis=-1)
+        # )
         negative_matches = tf.math.equal(matched_vals, -1)
         # [num_anchors, 4] or [batch_size, num_anchors, 4]
         matched_gt_boxes = target_gather._target_gather(
             gt_boxes, matched_gt_indices
         )
-        # [num_anchors, 4] or [batch_size, num_anchors, 4], used as `y_true` for regression loss
+        # [num_anchors, 4] or [batch_size, num_anchors, 4], used as `y_true`
+        # for regression loss
         encoded_box_targets = bounding_box._encode_box_to_deltas(
             anchors,
             matched_gt_boxes,
@@ -146,8 +152,8 @@ def call(

         # [num_anchors, 1] or [batch_size, num_anchors, 1]
         positive_mask = tf.expand_dims(positive_matches, axis=-1)
-        # set all negative and ignored matches to 0, and all positive matches to 1
-        # [num_anchors, 1] or [batch_size, num_anchors, 1]
+        # set all negative and ignored matches to 0, and all positive matches
+        # to 1. [num_anchors, 1] or [batch_size, num_anchors, 1]
         positive_classes = tf.ones_like(positive_mask, dtype=gt_classes.dtype)
         negative_classes = tf.zeros_like(positive_mask, dtype=gt_classes.dtype)
         # [num_anchors, 1] or [batch_size, num_anchors, 1]
@@ -187,7 +193,8 @@ def unpack_targets(self, targets, anchors_dict):
         target_shape = len(targets.get_shape().as_list())
         if target_shape != 2 and target_shape != 3:
             raise ValueError(
-                f"unpacking targets must be rank 2 or rank 3, got {target_shape}"
+                "unpacking targets must be rank 2 or rank 3, got "
+                f"{target_shape}"
             )
         unpacked_targets = {}
         count = 0
diff --git a/keras_cv/layers/object_detection/sampling.py b/keras_cv/layers/object_detection/sampling.py
index f6036c8d69..ce1674bfa4 100644
--- a/keras_cv/layers/object_detection/sampling.py
+++ b/keras_cv/layers/object_detection/sampling.py
@@ -43,9 +43,11 @@ def balanced_sample(
     N = positive_matches.get_shape().as_list()[-1]
     if N < num_samples:
         raise ValueError(
-            f"passed in {positive_matches.shape} has less element than {num_samples}"
+            f"passed in {positive_matches.shape} has fewer elements than "
+            f"{num_samples}"
         )
-    # random_val = tf.random.uniform(tf.shape(positive_matches), minval=0., maxval=1.)
+    # random_val = tf.random.uniform(tf.shape(positive_matches), minval=0.,
+    # maxval=1.)
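A small usage sketch of `balanced_sample` as exercised by the tests in this diff; the match flags below are made-up values, and the arguments are passed positionally as in the tests:

```python
import tensorflow as tf

# Import path assumed from the file location in this diff.
from keras_cv.layers.object_detection.sampling import balanced_sample

# 8 candidates per image: 2 positives, 4 negatives, 2 neither (ignored).
positive_matches = tf.constant(
    [[True, False, False, True, False, False, False, False]]
)
negative_matches = tf.constant(
    [[False, True, True, False, True, True, False, False]]
)

# Draw 4 samples per image, aiming for 50% positives (positional args,
# mirroring the test calls in this diff).
indicators = balanced_sample(positive_matches, negative_matches, 4, 0.5)
# `indicators` marks which of the 8 candidates were selected.
```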
zeros = tf.zeros_like(positive_matches, dtype=tf.float32) ones = tf.ones_like(positive_matches, dtype=tf.float32) ones_rand = ones + tf.random.uniform(ones.shape, minval=-0.2, maxval=0.2) diff --git a/keras_cv/layers/object_detection/sampling_test.py b/keras_cv/layers/object_detection/sampling_test.py index 4270ff8bbf..62ef533504 100644 --- a/keras_cv/layers/object_detection/sampling_test.py +++ b/keras_cv/layers/object_detection/sampling_test.py @@ -84,9 +84,11 @@ def test_balanced_batched_sampling(self): res = balanced_sample( positive_matches, negative_matches, num_samples, positive_fraction ) - # the 1st element from the 1st batch must be selected, given it's the only one + # the 1st element from the 1st batch must be selected, given it's the + # only one self.assertAllClose(res[0][0], 1) - # the 7th element from the 2nd batch must be selected, given it's the only one + # the 7th element from the 2nd batch must be selected, given it's the + # only one self.assertAllClose(res[1][6], 1) def test_balanced_sampling_over_positive_fraction(self): @@ -186,7 +188,7 @@ def test_balanced_sampling_no_positive(self): False, ] ) - # the rest are neither positive nor negative, but ignord matches + # the rest are neither positive nor negative, but ignored matches negative_matches = tf.constant( [False, False, True, False, False, True, False, False, True, False] ) diff --git a/keras_cv/layers/object_detection_3d/center_net_label_encoder.py b/keras_cv/layers/object_detection_3d/center_net_label_encoder.py index ffc4023ee8..9cd6e5911e 100644 --- a/keras_cv/layers/object_detection_3d/center_net_label_encoder.py +++ b/keras_cv/layers/object_detection_3d/center_net_label_encoder.py @@ -70,9 +70,11 @@ def compute_heatmap( max_radius: the maximum radius on each voxel dimension (xyz) Returns: - point_xyz: the point location w.r.t. vehicle frame, [B, boxes, max_voxels_per_box, 3] + point_xyz: the point location w.r.t. vehicle frame, [B, boxes, + max_voxels_per_box, 3] mask: point mask, [B, boxes, max_voxels_per_box] - heatmap: the returned heatmap w.r.t box frame, [B, boxes, max_voxels_per_box] + heatmap: the returned heatmap w.r.t box frame, [B, boxes, + max_voxels_per_box] box_id: the box id each point belongs to, [B, boxes, max_voxels_per_box] """ @@ -117,8 +119,8 @@ def compute_heatmap( point_xyz_rot + voxel_utils.inv_loc(rot, box_center)[:, :, tf.newaxis, :] ) - # Due to the transform above, z=0 can be transformed to a non-zero value. For - # 2d headmap, we do not want to use z. + # Due to the transform above, z=0 can be transformed to a non-zero value. + # For 2d heatmap, we do not want to use z. if voxel_size[2] > INF_VOXEL_SIZE: point_xyz_transform = tf.concat( [ @@ -191,17 +193,17 @@ def scatter_to_dense_heatmap( Args: point_xyz: [B, N, 3] 3d points, point coordinate in vehicle frame. point_mask: [B, N] valid point mask. - point_box_id: [B, N] box id of each point. The ID indexes into the input box - tensors. See compute_heatmap for more details. + point_box_id: [B, N] box id of each point. The ID indexes into the input + box tensors. See compute_heatmap for more details. heatmap: [B, N] heatmap value of each point. voxel_size: voxel size. spatial_size: the spatial size. Returns: dense_heatmap: [B, H, W] heatmap value. - dense_box_id: [B, H, W] box id associated with each feature map pixel. Only - pixels with positive heatmap value have valid box id set. Other locations - have random values. + dense_box_id: [B, H, W] box id associated with each feature map pixel. 
+ Only pixels with positive heatmap value have valid box id set. Other + locations have random values. """ # [B, N, 3] @@ -246,7 +248,8 @@ def fn(args): heatmap_i = tf.gather_nd(heatmap_i, mask_index) point_box_id_i = tf.gather_nd(point_box_id_i, mask_index) - # scatter from local heatmap to global heatmap based on point_xyz voxel units + # scatter from local heatmap to global heatmap based on point_xyz voxel + # units dense_heatmap_i = tf.tensor_scatter_nd_update( tf.zeros(voxel_spatial_size, dtype=heatmap_i.dtype), point_voxel_xyz_i, @@ -278,7 +281,8 @@ def decode_tensor( dims: list of ints., [H, W, Z] Returns: - t_decoded: int32 or int64 decoded tensor of shape [shape, len(dims)], [B, k, 3] + t_decoded: int32 or int64 decoded tensor of shape [shape, len(dims)], + [B, k, 3] """ with tf.name_scope("decode_tensor"): multipliers = [] @@ -313,24 +317,25 @@ def compute_top_k_heatmap_idx(heatmap: tf.Tensor, k: int) -> tf.Tensor: # each index in the range of [0, H*W*Z) _, indices = tf.math.top_k(heatmap_reshape, k=k, sorted=False) # [B, k, 2] or [B, k, 3] - # shape[1:] = [H, W, Z], convert the indices from 1 dimension to 3 dimensions - # in the range of [0, H), [0, W), [0, Z) + # shape[1:] = [H, W, Z], convert the indices from 1 dimension to 3 + # dimensions in the range of [0, H), [0, W), [0, Z) res = decode_tensor(indices, shape[1:]) return res class CenterNetLabelEncoder(keras.layers.Layer): - """Transforms the raw sparse labels into class specific dense training labels. + """Transforms the raw sparse labels into class specific dense training + labels. This layer takes the box locations, box classes and box masks, voxelizes and compute the Gaussian radius for each box, then computes class specific - heatmap for classification and class specific box offset w.r.t to feature map - for regression. + heatmap for classification and class specific box offset w.r.t to feature + map for regression. Args: voxel_size: the x, y, z dimension (in meters) of each voxel. - min_radius: minimum Gasussian radius in each dimension in meters. - max_radius: maximum Gasussian radius in each dimension in meters. + min_radius: minimum Gaussian radius in each dimension in meters. + max_radius: maximum Gaussian radius in each dimension in meters. spatial_size: the x, y, z boundary of voxels num_classes: number of object classes. top_k_heatmap: A sequence of integers, top k for each class. Can be None. @@ -416,7 +421,8 @@ def call(self, inputs): feature_map_ref_xyz = voxel_utils.compute_feature_map_ref_xyz( self._voxel_size, self._spatial_size, global_xyz ) - # convert from global box point xyz to offset w.r.t center of feature map. + # convert from global box point xyz to offset w.r.t center of feature + # map. # [B, H, W, Z, 3] dense_box_3d_center = dense_box_3d[..., :3] - feature_map_ref_xyz # [B, H, W, Z, 7] diff --git a/keras_cv/layers/object_detection_3d/heatmap_decoder.py b/keras_cv/layers/object_detection_3d/heatmap_decoder.py index 1a788b5495..059829f5b0 100644 --- a/keras_cv/layers/object_detection_3d/heatmap_decoder.py +++ b/keras_cv/layers/object_detection_3d/heatmap_decoder.py @@ -30,14 +30,14 @@ def decode_bin_heading(predictions: tf.Tensor, num_bin: int) -> tf.Tensor: and corresponding bin residuals (the following num_bin scores). 
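The flat-index decoding that `decode_tensor` and `compute_top_k_heatmap_idx` perform above boils down to integer division and modulo. A standalone sketch of that arithmetic (not a call into the library):

```python
import tensorflow as tf

# Recover (h, w, z) coordinates from indices into a flattened [H, W, Z] volume.
H, W, Z = 4, 5, 3
flat_idx = tf.constant([0, 17, 59])  # arbitrary indices in [0, H * W * Z)

h = flat_idx // (W * Z)
w = (flat_idx % (W * Z)) // Z
z = flat_idx % Z
coords = tf.stack([h, w, z], axis=-1)
# e.g. index 17 -> (1, 0, 2), since 1 * (5 * 3) + 0 * 3 + 2 == 17
```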
Args: - predictions: Prediction scores tensor with size [N, num_bin*2] predictions = - [:, bin_1, bin_2, ..., bin_k, res_1, res_2, ..., res_k], where k is the - number of bins and N is the number of boxes + predictions: Prediction scores tensor with size [N, num_bin*2] + predictions = [:, bin_1, bin_2, ..., bin_k, res_1, res_2, ..., res_k], + where k is the number of bins and N is the number of boxes. num_bin: A constant showing the number of bins used in heading bin loss. Returns: - heading: Decoded heading tensor with size [N] in which heading values are in - the [-pi, pi] range. + heading: Decoded heading tensor with size [N] in which heading values are + in the [-pi, pi] range. Raises: ValueError: If the rank of `predictions` is not 2 or `predictions` tensor @@ -46,11 +46,12 @@ def decode_bin_heading(predictions: tf.Tensor, num_bin: int) -> tf.Tensor: with tf.name_scope("decode_bin_heading"): if len(predictions.shape) != 2: raise ValueError( - f"The rank of the prediction tensor is expected to be 2. Instead " - f"it is : {len(predictions.shape)}." + "The rank of the prediction tensor is expected to be 2. " + f"Instead it is : {len(predictions.shape)}." ) - # Get the index of the bin with the maximum score to build a tensor of [N]. + # Get the index of the bin with the maximum score to build a tensor of + # [N]. bin_idx = tf.math.argmax( predictions[:, 0:num_bin], axis=-1, output_type=tf.int32 ) @@ -67,7 +68,8 @@ def decode_bin_heading(predictions: tf.Tensor, num_bin: int) -> tf.Tensor: residual_angle = residual_norm * (angle_per_class / 2) # bin_center is computed using the bin_idx and angle_per class, - # (e.g., 0, 30, 60, 90, 120, ..., 270, 300, 330). Then residual is added. + # (e.g., 0, 30, 60, 90, 120, ..., 270, 300, 330). Then residual is + # added. heading = tf.math.floormod( bin_idx_float * angle_per_class + residual_angle, 2 * np.pi ) @@ -98,14 +100,14 @@ def decode_bin_box(pd, num_head_bin, anchor_size): class HeatmapDecoder(keras.layers.Layer): - """A Keras layer that decodes predictions of an 3d object detection model. + """A Keras layer that decodes predictions of a 3d object detection model. Arg: - class_id: the integer index for a parcitular class. + class_id: the integer index for a particular class. num_head_bin: number of bin classes divided by [-2pi, 2pi]. anchor_size: the size of anchor at each xyz dimension. max_pool_size: the 2d pooling size for heatmap. - max_num_box: top number of boxes selectd from heatmap. + max_num_box: top number of boxes select from heatmap. heatmap_threshold: the threshold to set a heatmap as positive. voxel_size: the x, y, z dimension of each voxel. spatial_size: the x, y, z boundary of voxels. diff --git a/keras_cv/layers/object_detection_3d/voxel_utils.py b/keras_cv/layers/object_detection_3d/voxel_utils.py index 3bbfe0832d..fd054f2918 100644 --- a/keras_cv/layers/object_detection_3d/voxel_utils.py +++ b/keras_cv/layers/object_detection_3d/voxel_utils.py @@ -136,8 +136,8 @@ def point_to_voxel_coord( voxel_size, dtype=point_xyz.dtype ) assert dtype.is_integer or dtype.is_floating, f"{dtype}" - # Note: tf.round casts float to the nearest integer. If the float is 0.5, it - # casts it to the nearest even integer. + # Note: tf.round casts float to the nearest integer. If the float is + # 0.5, it casts it to the nearest even integer. 
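The `tf.round` note above refers to TensorFlow's round-half-to-even behaviour, which is easy to confirm:

```python
import tensorflow as tf

# Values exactly halfway between integers round to the nearest even integer.
print(tf.round([0.5, 1.5, 2.5, -0.5, -1.5]))
# -> [0., 2., 2., -0., -2.]
```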
point_voxelized_round = tf.math.round(point_voxelized) if dtype.is_floating: assert dtype == point_xyz.dtype, f"{dtype}" @@ -265,7 +265,8 @@ def _has_rank(tensor, expected_rank): def _pad_or_trim_to(x, shape, pad_val=0, pad_after_contents=True): """Pad and slice x to the given shape. - This is branched from Lingvo https://github.com/tensorflow/lingvo/blob/master/lingvo/core/py_utils.py. + This is branched from Lingvo + https://github.com/tensorflow/lingvo/blob/master/lingvo/core/py_utils.py. Internal usages for keras_cv libraries only. diff --git a/keras_cv/layers/object_detection_3d/voxel_utils_test.py b/keras_cv/layers/object_detection_3d/voxel_utils_test.py index c7cf4b0329..23f2f142af 100644 --- a/keras_cv/layers/object_detection_3d/voxel_utils_test.py +++ b/keras_cv/layers/object_detection_3d/voxel_utils_test.py @@ -18,7 +18,9 @@ class PadOrTrimToTest(tf.test.TestCase): - """Tests for pad_or_trim_to, branched from https://github.com/tensorflow/lingvo/blob/master/lingvo/core/py_utils_test.py.""" + """Tests for pad_or_trim_to, branched from + https://github.com/tensorflow/lingvo/blob/master/lingvo/core/py_utils_test.py. + """ def test_2D_constant_shape_pad(self): x = tf.random.normal(shape=(3, 3), seed=123456) diff --git a/keras_cv/layers/object_detection_3d/voxelization.py b/keras_cv/layers/object_detection_3d/voxelization.py index 8107c4288e..5b14c27354 100644 --- a/keras_cv/layers/object_detection_3d/voxelization.py +++ b/keras_cv/layers/object_detection_3d/voxelization.py @@ -109,8 +109,8 @@ def call( Returns: point_voxel_feature: [B, N, dim] voxel feature (delta_{x,y,z}). - point_voxel_id: [B, N] voxel ID of each point. Invalid voxels have Id's - set to 0. + point_voxel_id: [B, N] voxel ID of each point. Invalid voxels have + Id's set to 0. point_voxel_mask: [B, N] validpoint voxel boolean mask. """ # [B, N, dim] @@ -139,7 +139,7 @@ def call( ) # [B, N] - # remove points outside of the voxel boundary + # remove points outside the voxel boundary point_voxel_mask = tf.logical_and( point_voxel_xyz >= 0, point_voxel_xyz @@ -152,7 +152,8 @@ def call( # [B, N] point_voxel_mask_int = tf.cast(point_voxel_mask, dtype=tf.int32) - # [B, N] for voxel_id, int constant for num_voxels, in the range of [0, B * num_voxels] + # [B, N] for voxel_id, int constant for num_voxels, in the range of + # [0, B * num_voxels] point_voxel_id = compute_point_voxel_id( point_voxel_xyz, self._voxel_spatial_size ) @@ -170,7 +171,8 @@ class DynamicVoxelization(keras.layers.Layer): and max pools all point features inside each voxel. Args: - point_net: a keras Layer that project point feature into another dimension. + point_net: a keras Layer that project point feature into another + dimension. voxel_size: the x, y, z dimension of each voxel. spatial_size: the x, y, z boundary of voxels @@ -228,7 +230,8 @@ def call( point_voxel_id, point_voxel_mask, ) = self._voxelization_layer(point_xyz=point_xyz, point_mask=point_mask) - # TODO(tanzhenyu): move compute_point_voxel_id to here, so PointToVoxel layer is more generic. + # TODO(tanzhenyu): move compute_point_voxel_id to here, so PointToVoxel + # layer is more generic. 
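A hedged usage sketch of the `_pad_or_trim_to` helper shown earlier in this hunk; the import path is assumed from the file location in this diff, and only the output shapes are firm claims:

```python
import tensorflow as tf

# Import path assumed from the file location in this diff. Per the docstring,
# the helper pads and/or slices `x` to the requested shape.
from keras_cv.layers.object_detection_3d.voxel_utils import _pad_or_trim_to

x = tf.random.normal(shape=(3, 3), seed=123456)

padded = _pad_or_trim_to(x, [4, 6], pad_val=0)   # -> shape (4, 6), zero padded
trimmed = _pad_or_trim_to(x, [2, 2])             # -> shape (2, 2), sliced down
```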
point_feature = tf.concat([point_feature, point_voxel_feature], axis=-1) batch_size = ( point_feature.shape.as_list()[0] or tf.shape(point_feature)[0] diff --git a/keras_cv/layers/object_detection_3d/voxelization_test.py b/keras_cv/layers/object_detection_3d/voxelization_test.py index b37fbc5812..e7d3eb720e 100644 --- a/keras_cv/layers/object_detection_3d/voxelization_test.py +++ b/keras_cv/layers/object_detection_3d/voxelization_test.py @@ -47,7 +47,8 @@ def test_voxelization_output_shape_no_z(self): ) output = layer(point_xyz, point_feature, point_mask) # (20 - (-20)) / 0.1 = 400, (20 - (-20) ) / 1000 = 0.4 - # the last dimension is replaced with MLP dimension, z dimension is skipped + # the last dimension is replaced with MLP dimension, z dimension is + # skipped self.assertEqual(output.shape, [1, 400, 400, 20]) def test_voxelization_output_shape_with_z(self): @@ -71,7 +72,8 @@ def test_voxelization_output_shape_with_z(self): output = layer(point_xyz, point_feature, point_mask) # (20 - (-20)) / 0.1 = 400, (20 - (-20) ) / 1000 = 0.4 # (15 - (-15)) / 1 = 30 - # the last dimension is replaced with MLP dimension, z dimension is skipped + # the last dimension is replaced with MLP dimension, z dimension is + # skipped self.assertEqual(output.shape, [1, 400, 400, 30, 20]) def test_voxelization_numerical(self): diff --git a/keras_cv/layers/preprocessing/__init__.py b/keras_cv/layers/preprocessing/__init__.py index 4aa8006b6d..1b2d48d722 100644 --- a/keras_cv/layers/preprocessing/__init__.py +++ b/keras_cv/layers/preprocessing/__init__.py @@ -12,8 +12,8 @@ # See the License for the specific language governing permissions and # limitations under the License. -# Also export the image KPLs from core keras, so that user can import all the image -# KPLs from one place. +# Also export the image KPLs from core keras, so that user can import all the +# image KPLs from one place. from tensorflow.keras.layers import CenterCrop from tensorflow.keras.layers import RandomHeight @@ -76,6 +76,6 @@ from keras_cv.layers.preprocessing.rescaling import Rescaling from keras_cv.layers.preprocessing.resizing import Resizing from keras_cv.layers.preprocessing.solarization import Solarization -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) diff --git a/keras_cv/layers/preprocessing/aug_mix.py b/keras_cv/layers/preprocessing/aug_mix.py index f99f39d5a8..b6ed23f156 100644 --- a/keras_cv/layers/preprocessing/aug_mix.py +++ b/keras_cv/layers/preprocessing/aug_mix.py @@ -26,38 +26,39 @@ class AugMix(BaseImageAugmentationLayer): """Performs the AugMix data augmentation technique. - AugMix aims to produce images with variety while preserving the - image semantics and local statistics. During the augmentation process, each image - is augmented `num_chains` different ways, each way consisting of `chain_depth` - augmentations. Augmentations are sampled from the list: translation, shearing, - rotation, posterization, histogram equalization, solarization and auto contrast. - The results of each chain are then mixed together with the original - image based on random samples from a Dirichlet distribution. + AugMix aims to produce images with variety while preserving the image + semantics and local statistics. During the augmentation process, each image + is augmented `num_chains` different ways, each way consisting of + `chain_depth` augmentations. 
Augmentations are sampled from the list: + translation, shearing, rotation, posterization, histogram equalization, + solarization and auto contrast. The results of each chain are then mixed + together with the original image based on random samples from a Dirichlet + distribution. Args: value_range: the range of values the incoming images will have. Represented as a two number tuple written (low, high). This is typically either `(0, 1)` or `(0, 255)` depending - on how your preprocessing pipeline is setup. - severity: A tuple of two floats, a single float or a `keras_cv.FactorSampler`. - A value is sampled from the provided range. If a float is passed, the - range is interpreted as `(0, severity)`. This value represents the - level of strength of augmentations and is in the range [0, 1]. - Defaults to 0.3. + on how your preprocessing pipeline is set up. + severity: A tuple of two floats, a single float or a + `keras_cv.FactorSampler`. A value is sampled from the provided + range. If a float is passed, the range is interpreted as + `(0, severity)`. This value represents the level of strength of + augmentations and is in the range [0, 1]. Defaults to 0.3. num_chains: an integer representing the number of different chains to - be mixed. Defaults to 3. - chain_depth: an integer or range representing the number of transformations in - the chains. - If a range is passed, a random `chain_depth` value sampled from a uniform distribution over the given range is called at the start of the chain. - Defaults to [1,3]. + be mixed, defaults to 3. + chain_depth: an integer or range representing the number of + transformations in the chains. If a range is passed, a random + `chain_depth` value sampled from a uniform distribution over the + given range is called at the start of the chain. Defaults to [1,3]. alpha: a float value used as the probability coefficients for the - Beta and Dirichlet distributions. Defaults to 1.0. + Beta and Dirichlet distributions, defaults to 1.0. seed: Integer. Used to create a random seed. References: - [AugMix paper](https://arxiv.org/pdf/1912.02781) - [Official Code](https://github.com/google-research/augmix) - - [Unoffial TF Code](https://github.com/szacho/augmix-tf) + - [Unofficial TF Code](https://github.com/szacho/augmix-tf) Sample Usage: ```python diff --git a/keras_cv/layers/preprocessing/auto_contrast.py b/keras_cv/layers/preprocessing/auto_contrast.py index 026fea7499..afbb4667ba 100644 --- a/keras_cv/layers/preprocessing/auto_contrast.py +++ b/keras_cv/layers/preprocessing/auto_contrast.py @@ -15,7 +15,7 @@ import tensorflow as tf from tensorflow import keras -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing @@ -26,9 +26,9 @@ class AutoContrast(VectorizedBaseImageAugmentationLayer): """Performs the AutoContrast operation on an image. Auto contrast stretches the values of an image across the entire available - `value_range`. This makes differences between pixels more obvious. An example of - this is if an image only has values `[0, 1]` out of the range `[0, 255]`, auto - contrast will change the `1` values to be `255`. + `value_range`. This makes differences between pixels more obvious. An + example of this is if an image only has values `[0, 1]` out of the range + `[0, 255]`, auto contrast will change the `1` values to be `255`. 
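Both layers documented above take a `value_range` describing the incoming images. A short pipeline sketch (import paths assumed from the file locations in this diff; all other values are illustrative):

```python
import tensorflow as tf

# Import paths assumed from the file locations in this diff.
from keras_cv.layers.preprocessing.aug_mix import AugMix
from keras_cv.layers.preprocessing.auto_contrast import AutoContrast

# Both layers need to know the value range of the incoming images.
augmix = AugMix(value_range=(0, 255), severity=0.3, num_chains=3, seed=1234)
auto_contrast = AutoContrast(value_range=(0, 255))

images = tf.random.uniform((4, 64, 64, 3), maxval=255.0)
augmented = auto_contrast(augmix(images))
```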
Args: value_range: the range of values the incoming images will have. diff --git a/keras_cv/layers/preprocessing/base_image_augmentation_layer.py b/keras_cv/layers/preprocessing/base_image_augmentation_layer.py index 380990ef1c..16c1335163 100644 --- a/keras_cv/layers/preprocessing/base_image_augmentation_layer.py +++ b/keras_cv/layers/preprocessing/base_image_augmentation_layer.py @@ -19,7 +19,7 @@ from keras_cv.utils import preprocessing # In order to support both unbatched and batched inputs, the horizontal -# and verticle axis is reverse indexed +# and vertical axis is reverse indexed H_AXIS = -3 W_AXIS = -2 @@ -35,10 +35,10 @@ @keras.utils.register_keras_serializable(package="keras_cv") class BaseImageAugmentationLayer(keras.__internal__.layers.BaseRandomLayer): - """Abstract base layer for image augmentaion. + """Abstract base layer for image augmentation. This layer contains base functionalities for preprocessing layers which - augment image related data, eg. image and in future, label and bounding + augment image related data, e.g. image and in the future, label and bounding boxes. The subclasses could avoid making certain mistakes and reduce code duplications. @@ -53,10 +53,10 @@ class BaseImageAugmentationLayer(keras.__internal__.layers.BaseRandomLayer): the layer supports that. `get_random_transformation()`, which should produce a random transformation - setting. The transformation object, which could be of any type, will be passed - to `augment_image`, `augment_label` and `augment_bounding_boxes`, to - coordinate the randomness behaviour, e.g., in the RandomFlip layer, the image - and bounding_boxes should be changed in the same way. + setting. The transformation object, which could be of any type, will be + passed to `augment_image`, `augment_label` and `augment_bounding_boxes`, to + coordinate the randomness behaviour, e.g., in the RandomFlip layer, the + image and bounding_boxes should be changed in the same way. The `call()` method supports two formats of inputs: 1. A single image tensor with shape (height, width, channels) or @@ -80,10 +80,10 @@ class BaseImageAugmentationLayer(keras.__internal__.layers.BaseRandomLayer): The `call()` will unpack the inputs, forward to the correct function, and pack the output back to the same structure as the inputs. - By default the `call()` method leverages the `tf.vectorized_map()` function. - Auto-vectorization can be disabled by setting `self.auto_vectorize = False` - in your `__init__()` method. When disabled, `call()` instead relies - on `tf.map_fn()`. For example: + By default, the `call()` method leverages the `tf.vectorized_map()` + function. Auto-vectorization can be disabled by setting + `self.auto_vectorize = False` in your `__init__()` method. When disabled, + `call()` instead relies on `tf.map_fn()`. For example: ```python class SubclassLayer(keras_cv.BaseImageAugmentationLayer): @@ -138,9 +138,9 @@ def force_output_dense_images(self, force_output_dense_images): def auto_vectorize(self): """Control whether automatic vectorization occurs. - By default the `call()` method leverages the `tf.vectorized_map()` - function. Auto-vectorization can be disabled by setting - `self.auto_vectorize = False` in your `__init__()` method. When + By default, the `call()` method leverages the `tf.vectorized_map()` + function. Auto-vectorization can be disabled by setting + `self.auto_vectorize = False` in your `__init__()` method. When disabled, `call()` instead relies on `tf.map_fn()`. 
For example: ```python @@ -157,11 +157,13 @@ def auto_vectorize(self, auto_vectorize): self._auto_vectorize = auto_vectorize def compute_image_signature(self, images): - """Computes the output image signature for the `augment_image()` function. + """Computes the output image signature for the `augment_image()` + function. - Must be overridden to return tensors with different shapes than the input - images. By default returns either a `tf.RaggedTensorSpec` matching the input - image spec, or a `tf.TensorSpec` matching the input image spec. + Must be overridden to return tensors with different shapes than the + input images. By default, returns either a `tf.RaggedTensorSpec` + matching the input image spec, or a `tf.TensorSpec` matching the input + image spec. """ if self.force_output_dense_images: return tf.TensorSpec(images.shape[1:], self.compute_dtype) @@ -248,7 +250,8 @@ def _any_ragged(inputs): return False def _map_fn(self, func, inputs): - """Returns either tf.map_fn or tf.vectorized_map based on the provided inputs. + """Returns either tf.map_fn or tf.vectorized_map based on the provided + inputs. Args: inputs: dictionary of inputs provided to map_fn. @@ -271,7 +274,8 @@ def augment_image(self, image, transformation, **kwargs): `layer.call()`. transformation: The transformation object produced by `get_random_transformation`. Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. Returns: output 3D tensor, which will be forward to `layer.call()`. @@ -285,7 +289,8 @@ def augment_label(self, label, transformation, **kwargs): label: 1D label to the layer. Forwarded from `layer.call()`. transformation: The transformation object produced by `get_random_transformation`. Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. Returns: output 1D tensor, which will be forward to `layer.call()`. @@ -299,7 +304,8 @@ def augment_target(self, target, transformation, **kwargs): target: 1D label to the layer. Forwarded from `layer.call()`. transformation: The transformation object produced by `get_random_transformation`. Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. Returns: output 1D tensor, which will be forward to `layer.call()`. @@ -310,13 +316,12 @@ def augment_bounding_boxes(self, bounding_boxes, transformation, **kwargs): """Augment bounding boxes for one image during training. Args: - image: 3D image input tensor to the layer. Forwarded from - `layer.call()`. bounding_boxes: 2D bounding boxes to the layer. Forwarded from `call()`. transformation: The transformation object produced by `get_random_transformation`. Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. Returns: output 2D tensor, which will be forward to `layer.call()`. @@ -331,7 +336,8 @@ def augment_keypoints(self, keypoints, transformation, **kwargs): `layer.call()`. transformation: The transformation object produced by `get_random_transformation`. Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. 
Returns: output 2D tensor, which will be forward to `layer.call()`. @@ -345,14 +351,16 @@ def augment_segmentation_mask( Args: segmentation_mask: 3D segmentation mask input tensor to the layer. - This should generally have the shape [H, W, 1], or in some cases [H, W, C] for multilabeled data. - Forwarded from `layer.call()`. + This should generally have the shape [H, W, 1], or in some cases + [H, W, C] for multilabeled data. Forwarded from `layer.call()`. transformation: The transformation object produced by `get_random_transformation`. Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. Returns: - output 3D tensor containing the augmented segmentation mask, which will be forward to `layer.call()`. + output 3D tensor containing the augmented segmentation mask, which + will be forward to `layer.call()`. """ raise NotImplementedError() @@ -372,7 +380,7 @@ def get_random_transformation( Args: image: 3D image tensor from inputs. label: optional 1D label tensor from inputs. - bounding_box: optional 2D bounding boxes tensor from inputs. + bounding_boxes: optional 2D bounding boxes tensor from inputs. segmentation_mask: optional 3D segmentation mask tensor from inputs. Returns: @@ -406,9 +414,9 @@ def _augment(self, inputs): segmentation_mask = inputs.get(SEGMENTATION_MASKS, None) image_ragged = isinstance(image, tf.RaggedTensor) - # At this point, the tensor is not actually ragged as we have mapped over the - # batch axis. This call is required to make `tf.shape()` behave as users - # subclassing the layer expect. + # At this point, the tensor is not actually ragged as we have mapped + # over the batch axis. This call is required to make `tf.shape()` behave + # as users subclassing the layer expect. if image_ragged: image = image.to_tensor() @@ -489,7 +497,8 @@ def _format_inputs(self, inputs): if not isinstance(inputs, dict): raise ValueError( - f"Expect the inputs to be image tensor or dict. Got inputs={inputs}" + "Expect the inputs to be image tensor or dict. Got " + f"inputs={inputs}" ) if BOUNDING_BOXES in inputs: @@ -507,13 +516,14 @@ def _format_inputs(self, inputs): return inputs, metadata def _format_bounding_boxes(self, bounding_boxes): - # We can't catch the case where this is None, sometimes RaggedTensor drops this - # dimension + # We can't catch the case where this is None, sometimes RaggedTensor + # drops this dimension if "classes" not in bounding_boxes: raise ValueError( - "Bounding boxes are missing class_id. If you would like to pad the " - "bounding boxes with class_id, use: " - "`bounding_boxes['classes'] = tf.ones_like(bounding_boxes['boxes'])`." + "Bounding boxes are missing class_id. If you would like to pad " + "the bounding boxes with class_id, use: " + "`bounding_boxes['classes'] = " + "tf.ones_like(bounding_boxes['boxes'])`." 
) return bounding_boxes diff --git a/keras_cv/layers/preprocessing/channel_shuffle.py b/keras_cv/layers/preprocessing/channel_shuffle.py index 2bf84f9e48..e110ebaa19 100644 --- a/keras_cv/layers/preprocessing/channel_shuffle.py +++ b/keras_cv/layers/preprocessing/channel_shuffle.py @@ -15,7 +15,7 @@ import tensorflow as tf from tensorflow import keras -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) @@ -32,7 +32,7 @@ class ChannelShuffle(VectorizedBaseImageAugmentationLayer): `(..., height, width, channels)`, in `"channels_last"` format Args: - groups: Number of groups to divide the input channels. Default 3. + groups: Number of groups to divide the input channels, defaults to 3. seed: Integer. Used to create a random seed. Usage: diff --git a/keras_cv/layers/preprocessing/cut_mix.py b/keras_cv/layers/preprocessing/cut_mix.py index 90cf99f1e2..dc939f25dc 100644 --- a/keras_cv/layers/preprocessing/cut_mix.py +++ b/keras_cv/layers/preprocessing/cut_mix.py @@ -27,9 +27,9 @@ class CutMix(BaseImageAugmentationLayer): Args: alpha: Float between 0 and 1. Inverse scale parameter for the gamma - distribution. This controls the shape of the distribution from which the - smoothing values are sampled. Defaults to 1.0, which is a recommended value - when training an imagenet1k classification model. + distribution. This controls the shape of the distribution from which + the smoothing values are sampled. Defaults to 1.0, which is a + recommended value when training an imagenet1k classification model. seed: Integer. Used to create a random seed. References: - [CutMix paper]( https://arxiv.org/abs/1905.04899). diff --git a/keras_cv/layers/preprocessing/equalization.py b/keras_cv/layers/preprocessing/equalization.py index 0b86e8c8fb..95ea113e1d 100644 --- a/keras_cv/layers/preprocessing/equalization.py +++ b/keras_cv/layers/preprocessing/equalization.py @@ -26,12 +26,12 @@ class Equalization(BaseImageAugmentationLayer): """Equalization performs histogram equalization on a channel-wise basis. Args: - value_range: a tuple or a list of two elements. The first value represents - the lower bound for values in passed images, the second represents the - upper bound. Images passed to the layer should have values within - `value_range`. - bins: Integer indicating the number of bins to use in histogram equalization. - Should be in the range [0, 256]. + value_range: a tuple or a list of two elements. The first value + represents the lower bound for values in passed images, the second + represents the upper bound. Images passed to the layer should have + values within `value_range`. + bins: Integer indicating the number of bins to use in histogram + equalization. Should be in the range [0, 256]. Usage: ```python @@ -43,8 +43,8 @@ class Equalization(BaseImageAugmentationLayer): ``` Call arguments: - images: Tensor of pixels in range [0, 255], in RGB format. Can be - of type float or int. Should be in NHWC format. + images: Tensor of pixels in range [0, 255], in RGB format. Can be + of type float or int. Should be in NHWC format. """ def __init__(self, value_range, bins=256, **kwargs): @@ -64,9 +64,10 @@ def equalize_channel(self, image, channel_index): # Compute the histogram of the image channel. histogram = tf.histogram_fixed_width(image, [0, 255], nbins=self.bins) - # For the purposes of computing the step, filter out the nonzeros. 
- # Zeroes are replaced by a big number while calculating min to keep shape - # constant across input sizes for compatibility with vectorized_map + # For the purposes of computing the step, filter out the non-zeros. + # Zeroes are replaced by a big number while calculating min to keep + # shape constant across input sizes for compatibility with + # vectorized_map big_number = 1410065408 histogram_without_zeroes = tf.where( @@ -85,11 +86,11 @@ def build_mapping(histogram, step): lookup_table = (tf.cumsum(histogram) + (step // 2)) // step # Shift lookup_table, prepending with 0. lookup_table = tf.concat([[0], lookup_table[:-1]], 0) - # Clip the counts to be in range. This is done + # Clip the counts to be in range. This is done # in the C code for image.point. return tf.clip_by_value(lookup_table, 0, 255) - # If step is zero, return the original image. Otherwise, build + # If step is zero, return the original image. Otherwise, build # lookup table from the full histogram and step and then index from it. result = tf.cond( tf.equal(step, 0), diff --git a/keras_cv/layers/preprocessing/fourier_mix.py b/keras_cv/layers/preprocessing/fourier_mix.py index 0115ef0d2c..2089967f19 100644 --- a/keras_cv/layers/preprocessing/fourier_mix.py +++ b/keras_cv/layers/preprocessing/fourier_mix.py @@ -25,12 +25,12 @@ class FourierMix(BaseImageAugmentationLayer): """FourierMix implements the FMix data augmentation technique. Args: - alpha: Float value for beta distribution. Inverse scale parameter for the gamma - distribution. This controls the shape of the distribution from which the - smoothing values are sampled. Defaults to 0.5, which is a recommended value - in the paper. - decay_power: A float value representing the decay power. Defaults to 3, as - recommended in the paper. + alpha: Float value for beta distribution. Inverse scale parameter for + the gamma distribution. This controls the shape of the distribution + from which the smoothing values are sampled. Defaults to 0.5, which + is a recommended value in the paper. + decay_power: A float value representing the decay power, defaults to 3, + as recommended in the paper. seed: Integer. Used to create a random seed. References: - [FMix paper](https://arxiv.org/abs/2002.12047). @@ -39,7 +39,9 @@ class FourierMix(BaseImageAugmentationLayer): ```python (images, labels), _ = keras.datasets.cifar10.load_data() fourier_mix = keras_cv.layers.preprocessing.FourierMix(0.5) - augmented_images, updated_labels = fourier_mix({'images': images, 'labels': labels}) + augmented_images, updated_labels = fourier_mix( + {'images': images, 'labels': labels} + ) # output == {'images': updated_images, 'labels': updated_labels} ``` """ @@ -61,9 +63,9 @@ def _sample_from_beta(self, alpha, beta, shape): @staticmethod def _fftfreq(signal_size, sample_spacing=1): - """This function returns the sample frequencies of a discrete fourier transform. - The result array contains the frequency bin centers starting at 0 using the - sample spacing. + """This function returns the sample frequencies of a discrete fourier + transform. The result array contains the frequency bin centers starting + at 0 using the sample spacing. """ results = tf.concat( @@ -85,7 +87,8 @@ def _apply_fftfreq(self, h, w): return tf.math.sqrt(fx * fx + fy * fy) def _get_spectrum(self, freqs, decay_power, channel, h, w): - # Function to apply a low pass filter by decaying its high frequency components. + # Function to apply a low pass filter by decaying its high frequency + # components. 
scale = tf.ones(1) / tf.cast( tf.math.maximum( freqs, tf.convert_to_tensor([1 / tf.reduce_max([w, h])]) @@ -159,9 +162,9 @@ def _batch_augment(self, inputs): def _augment(self, inputs): raise ValueError( - "FourierMix received a single image to `call`. The layer relies on " + "FourierMix received a single image to `call`. The layer relies on " "combining multiple examples, and as such will not behave as " - "expected. Please call the layer with 2 or more samples." + "expected. Please call the layer with 2 or more samples." ) def _fourier_mix(self, images): diff --git a/keras_cv/layers/preprocessing/grayscale.py b/keras_cv/layers/preprocessing/grayscale.py index f32811037b..4be0dff405 100644 --- a/keras_cv/layers/preprocessing/grayscale.py +++ b/keras_cv/layers/preprocessing/grayscale.py @@ -15,14 +15,15 @@ import tensorflow as tf from tensorflow import keras -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) @keras.utils.register_keras_serializable(package="keras_cv") class Grayscale(VectorizedBaseImageAugmentationLayer): - """Grayscale is a preprocessing layer that transforms RGB images to Grayscale images. + """Grayscale is a preprocessing layer that transforms RGB images to + Grayscale images. Input images should have values in the range of [0, 255]. Input shape: diff --git a/keras_cv/layers/preprocessing/grid_mask.py b/keras_cv/layers/preprocessing/grid_mask.py index 02e184fe40..6253df1932 100644 --- a/keras_cv/layers/preprocessing/grid_mask.py +++ b/keras_cv/layers/preprocessing/grid_mask.py @@ -52,31 +52,34 @@ class GridMask(BaseImageAugmentationLayer): Ratio determines the ratio from spacings to grid masks. Lower values make the grid size smaller, and higher values make the grid mask large. - Floats should be in the range [0, 1]. 0.5 indicates that grid and + Floats should be in the range [0, 1]. 0.5 indicates that grid and spacing will be of equal size. To always use the same value, pass a `keras_cv.ConstantFactorSampler()`. Defaults to `(0, 0.5)`. rotation_factor: - The rotation_factor will be used to randomly rotate the grid_mask during - training. Default to 0.1, which results in an output rotating by a - random amount in the range [-10% * 2pi, 10% * 2pi]. + The rotation_factor will be used to randomly rotate the grid_mask + during training. Default to 0.1, which results in an output rotating + by a random amount in the range [-10% * 2pi, 10% * 2pi]. A float represented as fraction of 2 Pi, or a tuple of size 2 representing lower and upper bound for rotating clockwise and - counter-clockwise. A positive values means rotating counter clock-wise, - while a negative value means clock-wise. When represented as a single - float, this value is used for both the upper and lower bound. For - instance, factor=(-0.2, 0.3) results in an output rotation by a random - amount in the range [-20% * 2pi, 30% * 2pi]. factor=0.2 results in an - output rotating by a random amount in the range [-20% * 2pi, 20% * 2pi]. + counter-clockwise. A positive values means rotating counter + clock-wise, while a negative value means clock-wise. When + represented as a single float, this value is used for both the upper + and lower bound. For instance, factor=(-0.2, 0.3) results in an + output rotation by a random amount in the range [-20% * 2pi, + 30% * 2pi]. 
factor=0.2 results in an output rotating by a random + amount in the range [-20% * 2pi, 20% * 2pi]. fill_mode: Pixels inside the gridblock are filled according to the given - mode (one of `{"constant", "gaussian_noise"}`). Default: "constant". + mode (one of `{"constant", "gaussian_noise"}`), defaults to + "constant". - *constant*: Pixels are filled with the same constant value. - *gaussian_noise*: Pixels are filled with random gaussian noise. - fill_value: an integer represents of value to be filled inside the gridblock - when `fill_mode="constant"`. Valid integer range [0 to 255] + fill_value: an integer representing the value to be filled inside the + gridblock when `fill_mode="constant"`. Valid integer range + [0 to 255] seed: Integer. Used to create a random seed. Usage: @@ -107,7 +110,7 @@ def __init__( if isinstance(rotation_factor, core.FactorSampler): raise ValueError( "Currently `GridMask.rotation_factor` does not support the " - "`FactorSampler` API. This will be supported in the next Keras " + "`FactorSampler` API. This will be supported in the next Keras " "release. For now, please pass a float for the " "`rotation_factor` argument." ) @@ -136,7 +139,7 @@ def _check_parameter_values(self): if fill_mode not in ["constant", "gaussian_noise", "random"]: raise ValueError( '`fill_mode` should be "constant", ' - f'"gaussian_noise", or "random". Got `fill_mode`={fill_mode}' + f'"gaussian_noise", or "random". Got `fill_mode`={fill_mode}' ) def get_random_transformation( diff --git a/keras_cv/layers/preprocessing/jittered_resize.py b/keras_cv/layers/preprocessing/jittered_resize.py index f3584173df..55671b117f 100644 --- a/keras_cv/layers/preprocessing/jittered_resize.py +++ b/keras_cv/layers/preprocessing/jittered_resize.py @@ -29,21 +29,22 @@ class JitteredResize(BaseImageAugmentationLayer): """JitteredResize implements resize with scale distortion. - JitteredResize takes a three step approach to size-distortion based image - augmentation. This technique is specifically tuned for object detection pipelines. - The layer takes an input of images and bounding boxes, both of which may be ragged. - It outputs a dense image tensor, ready to feed to a model for training. - As such this layer will commonly be the final step in an augmentation - pipeline. + JitteredResize takes a three-step approach to size-distortion based image + augmentation. This technique is specifically tuned for object detection + pipelines. The layer takes an input of images and bounding boxes, both of + which may be ragged. It outputs a dense image tensor, ready to feed to a + model for training. As such this layer will commonly be the final step in an + augmentation pipeline. The augmentation process is as follows: - The image is first scaled according to a randomly sampled scale factor. The width - and height of the image are then resized according to the sampled scale. This is - done to introduce noise into the local scale of features in the image. A subset of - the image is then cropped randomly according to `crop_size`. This crop is then - padded to be `target_size`. Bounding boxes are translated and scaled according to - the random scaling and random cropping. + The image is first scaled according to a randomly sampled scale factor. The + width and height of the image are then resized according to the sampled + scale. This is done to introduce noise into the local scale of features in + the image. A subset of the image is then cropped randomly according to + `crop_size`.
This crop is then padded to be `target_size`. Bounding boxes + are translated and scaled according to the random scaling and random + cropping. Usage: ```python @@ -53,7 +54,9 @@ class JitteredResize(BaseImageAugmentationLayer): scale_factor=(0.8, 1.25), bounding_box_format="xywh", ) - train_ds = train_ds.map(jittered_resize, num_parallel_calls=tf.data.AUTOTUNE) + train_ds = train_ds.map( + jittered_resize, num_parallel_calls=tf.data.AUTOTUNE + ) # images now are (640, 640, 3) # an example using crop size @@ -64,24 +67,28 @@ class JitteredResize(BaseImageAugmentationLayer): scale_factor=(0.8, 1.25), bounding_box_format="xywh", ) - train_ds = train_ds.map(jittered_resize, num_parallel_calls=tf.data.AUTOTUNE) + train_ds = train_ds.map( + jittered_resize, num_parallel_calls=tf.data.AUTOTUNE + ) # images now are (640, 640, 3), but they were resized from a 250x250 crop. ``` Args: - target_size: A tuple repesenting the output size of images. - scale_factor: A tuple of two floats or a `keras_cv.FactorSampler`. For each - augmented image a value is sampled from the provided range. + target_size: A tuple representing the output size of images. + scale_factor: A tuple of two floats or a `keras_cv.FactorSampler`. For + each augmented image a value is sampled from the provided range. This factor is used to scale the input image. To replicate the results of the MaskRCNN paper pass `(0.8, 1.25)`. - crop_size: (Optional) the size of the image to crop from the scaled image. - Defaults to `target_size` when not provided. - bounding_box_format: The format of bounding boxes of input boxes. Refer - to https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box/converters.py + crop_size: (Optional) the size of the image to crop from the scaled + image, defaults to `target_size` when not provided. + bounding_box_format: The format of bounding boxes of input boxes. + Refer to + https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box/converters.py for more details on supported bounding box formats. - interpolation: String, the interpolation method. Defaults to `"bilinear"`. - Supports `"bilinear"`, `"nearest"`, `"bicubic"`, `"area"`, `"lanczos3"`, - `"lanczos5"`, `"gaussian"`, `"mitchellcubic"`. + interpolation: String, the interpolation method, defaults to + `"bilinear"`. Supports `"bilinear"`, `"nearest"`, `"bicubic"`, + `"area"`, `"lanczos3"`, `"lanczos5"`, `"gaussian"`, + `"mitchellcubic"`. seed: (Optional) integer to use as the random seed. """ @@ -98,8 +105,8 @@ def __init__( super().__init__(**kwargs) if not isinstance(target_size, tuple) or len(target_size) != 2: raise ValueError( - "JitteredResize() expects `target_size` to be " - f"a tuple of two integers. Received `target_size={target_size}`" + "JitteredResize() expects `target_size` to be a tuple of two " + f"integers. Received `target_size={target_size}`" ) crop_size = crop_size or target_size diff --git a/keras_cv/layers/preprocessing/maybe_apply.py b/keras_cv/layers/preprocessing/maybe_apply.py index cc6c44291c..6fb322411c 100644 --- a/keras_cv/layers/preprocessing/maybe_apply.py +++ b/keras_cv/layers/preprocessing/maybe_apply.py @@ -24,20 +24,20 @@ class MaybeApply(BaseImageAugmentationLayer): """Apply provided layer to random elements in a batch. Args: - layer: a keras `Layer` or `BaseImageAugmentationLayer`. This layer will be - applied to randomly chosen samples in a batch. Layer should not modify the - size of provided inputs. - rate: controls the frequency of applying the layer. 
1.0 means all elements in - a batch will be modified. 0.0 means no elements will be modified. - Defaults to 0.5. - batchwise: (Optional) bool, whether or not to pass entire batches to the - underlying layer. When set to true, only a single random sample is + layer: a keras `Layer` or `BaseImageAugmentationLayer`. This layer will + be applied to randomly chosen samples in a batch. Layer should not + modify the size of provided inputs. + rate: controls the frequency of applying the layer. 1.0 means all + elements in a batch will be modified. 0.0 means no elements will be + modified. Defaults to 0.5. + batchwise: (Optional) bool, whether to pass entire batches to the + underlying layer. When set to true, only a single random sample is drawn to determine if the batch should be passed to the underlying - layer. This is useful when using `MixUp()`, `CutMix()`, `Mosaic()`, + layer. This is useful when using `MixUp()`, `CutMix()`, `Mosaic()`, etc. auto_vectorize: bool, whether to use tf.vectorized_map or tf.map_fn for - batched input. Setting this to True might give better performance but - currently doesn't work with XLA. Defaults to False. + batched input. Setting this to True might give better performance + but currently doesn't work with XLA. Defaults to False. seed: integer, controls random behaviour. Example usage: @@ -84,7 +84,8 @@ class MaybeApply(BaseImageAugmentationLayer): # [[0. , 0. ], # [0. , 0. ]]], dtype=float32)> - # We can observe that the layer has been randomly applied to 2 out of 5 samples. + # We can observe that the layer has been randomly applied to 2 out of 5 + # samples. ``` """ diff --git a/keras_cv/layers/preprocessing/mix_up.py b/keras_cv/layers/preprocessing/mix_up.py index 447e58cf55..2574438906 100644 --- a/keras_cv/layers/preprocessing/mix_up.py +++ b/keras_cv/layers/preprocessing/mix_up.py @@ -27,9 +27,9 @@ class MixUp(BaseImageAugmentationLayer): Args: alpha: Float between 0 and 1. Inverse scale parameter for the gamma - distribution. This controls the shape of the distribution from which the - smoothing values are sampled. Defaults to 0.2, which is a recommended value - when training an imagenet1k classification model. + distribution. This controls the shape of the distribution from which + the smoothing values are sampled. Defaults to 0.2, which is a + recommended value when training an imagenet1k classification model. seed: Integer. Used to create a random seed.
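For intuition, the blending controlled by `alpha` follows the MixUp recipe from the referenced paper: a coefficient is drawn from `Beta(alpha, alpha)` and used to interpolate both images and labels. The sketch below is illustrative; the helper name and the Gamma-based Beta sampling are assumptions, not the layer's internals.

```python
import tensorflow as tf


def mix_up_sketch(images, labels, alpha=0.2):
    """Conceptual MixUp: blend each sample with a shuffled partner."""
    images = tf.cast(images, tf.float32)
    labels = tf.cast(labels, tf.float32)
    batch_size = tf.shape(images)[0]
    # lambda ~ Beta(alpha, alpha), built from two Gamma draws since TF has no
    # direct Beta sampler.
    g1 = tf.random.gamma([batch_size], alpha)
    g2 = tf.random.gamma([batch_size], alpha)
    lam = g1 / (g1 + g2)
    perm = tf.random.shuffle(tf.range(batch_size))
    lam_img = tf.reshape(lam, [-1, 1, 1, 1])
    lam_lab = tf.reshape(lam, [-1, 1])
    mixed_images = lam_img * images + (1.0 - lam_img) * tf.gather(images, perm)
    mixed_labels = lam_lab * labels + (1.0 - lam_lab) * tf.gather(labels, perm)
    return mixed_images, mixed_labels
```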
References: @@ -43,7 +43,9 @@ class MixUp(BaseImageAugmentationLayer): # Labels must be floating-point and one-hot encoded labels = tf.cast(tf.one_hot(labels, 10), tf.float32) mixup = keras_cv.layers.preprocessing.MixUp(10) - augmented_images, updated_labels = mixup({'images': images, 'labels': labels}) + augmented_images, updated_labels = mixup( + {'images': images, 'labels': labels} + ) # output == {'images': updated_images, 'labels': updated_labels} ``` """ diff --git a/keras_cv/layers/preprocessing/mosaic.py b/keras_cv/layers/preprocessing/mosaic.py index 22a3e68d66..c89a864f21 100644 --- a/keras_cv/layers/preprocessing/mosaic.py +++ b/keras_cv/layers/preprocessing/mosaic.py @@ -16,19 +16,19 @@ from tensorflow import keras from keras_cv import bounding_box -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 BATCHED, ) -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 BOUNDING_BOXES, ) -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 IMAGES, ) -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 LABELS, ) -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing as preprocessing_utils @@ -40,9 +40,9 @@ class Mosaic(VectorizedBaseImageAugmentationLayer): Mosaic data augmentation first takes 4 images from the batch and makes a grid. After that based on the offset, a crop is taken to form the mosaic - image. Labels are in the same ratio as the the area of their images in the - output image. Bounding boxes are translated according to the position of - the 4 images. + image. Labels are in the same ratio as the area of their images in the + output image. Bounding boxes are translated according to the position of the + 4 images. Args: offset: A tuple of two floats, a single float or @@ -55,14 +55,14 @@ class Mosaic(VectorizedBaseImageAugmentationLayer): pass a tuple with two identical floats: `(0.5, 0.5)`. Defaults to (0.25, 0.75). bounding_box_format: a case-insensitive string (for example, "xyxy") to - be passed if bounding boxes are being augmented by this layer. - Each bounding box is defined by at least these 4 values. The inputs - may contain additional information such as classes and confidence - after these 4 values but these values will be ignored and returned - as is. For detailed information on the supported formats, see the + be passed if bounding boxes are being augmented by this layer. Each + bounding box is defined by at least these 4 values. The inputs may + contain additional information such as classes and confidence after + these 4 values but these values will be ignored and returned as is. + For detailed information on the supported formats, see the [KerasCV bounding box documentation](https://keras.io/api/keras_cv/bounding_box/formats/). - Defualts to None. - seed: Integer. Used to create a random seed. + Defaults to None. 
+ seed: integer, used to create a random seed. References: - [Yolov4 paper](https://arxiv.org/pdf/2004.10934). @@ -78,7 +78,7 @@ class Mosaic(VectorizedBaseImageAugmentationLayer): output = mosaic({'images': images, 'labels': labels}) # output == {'images': updated_images, 'labels': updated_labels} ``` - """ + """ # noqa: E501 def __init__( self, offset=(0.25, 0.75), bounding_box_format=None, seed=None, **kwargs @@ -231,7 +231,6 @@ def augment_bounding_boxes( ], axis=-1, ) - # updates bounding_boxes for one output mosaic permutation_order = transformations["permutation_order"] classes_for_mosaic = tf.gather(classes, permutation_order) diff --git a/keras_cv/layers/preprocessing/posterization.py b/keras_cv/layers/preprocessing/posterization.py index 1a3d22d482..131c66caec 100644 --- a/keras_cv/layers/preprocessing/posterization.py +++ b/keras_cv/layers/preprocessing/posterization.py @@ -26,39 +26,37 @@ class Posterization(BaseImageAugmentationLayer): """Reduces the number of bits for each color channel. References: - [AutoAugment: Learning Augmentation Policies from Data]( - https://arxiv.org/abs/1805.09501 - ) - - [RandAugment: Practical automated data augmentation with a reduced search space]( - https://arxiv.org/abs/1909.13719 - ) + - [AutoAugment: Learning Augmentation Policies from Data](https://arxiv.org/abs/1805.09501) + - [RandAugment: Practical automated data augmentation with a reduced search space](https://arxiv.org/abs/1909.13719) Args: - value_range: a tuple or a list of two elements. The first value represents - the lower bound for values in passed images, the second represents the - upper bound. Images passed to the layer should have values within - `value_range`. Defaults to `(0, 255)`. - bits: integer. The number of bits to keep for each channel. Must be a value - between 1-8. + value_range: a tuple or a list of two elements. The first value + represents the lower bound for values in passed images, the second + represents the upper bound. Images passed to the layer should have + values within `value_range`. Defaults to `(0, 255)`. + bits: integer, the number of bits to keep for each channel. Must be a + value between 1-8. Usage: ```python (images, labels), _ = keras.datasets.cifar10.load_data() print(images[0, 0, 0]) # [59 62 63] - # Note that images are Tensors with values in the range [0, 255] and uint8 dtype + # Note that images are Tensors with values in the range [0, 255] and uint8 + # dtype posterization = Posterization(bits=4, value_range=[0, 255]) images = posterization(images) print(images[0, 0, 0]) # [48., 48., 48.] - # NOTE: the layer will output values in tf.float32, regardless of input dtype. + # NOTE: the layer will output values in tf.float32, regardless of input + # dtype. ``` Call arguments: inputs: input tensor in two possible formats: 1. single 3D (HWC) image or 4D (NHWC) batch of images. 2. A dict of tensors where the images are under `"images"` key.
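The `[59 62 63]` to `[48., 48., 48.]` example above corresponds to dropping the low-order bits of each channel. A rough sketch, assuming plain bit-shift posterization rather than the layer's exact lookup-table mechanics:

```python
import tensorflow as tf


def posterize_sketch(image, bits):
    """Keep only the top `bits` bits of each 8-bit channel (illustrative)."""
    shift = 8 - bits
    image = tf.cast(image, tf.int32)
    # With bits=4, dropping the low-order bits quantizes 59 -> 48, matching
    # the example above.
    posterized = tf.bitwise.left_shift(
        tf.bitwise.right_shift(image, shift), shift
    )
    # The layer documents float32 output regardless of input dtype.
    return tf.cast(posterized, tf.float32)
```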
- """ + """ # noqa: E501 def __init__(self, value_range, bits, **kwargs): super().__init__(**kwargs) @@ -104,8 +102,8 @@ def augment_segmentation_mask( return segmentation_mask def _batch_augment(self, inputs): - # Skip the use of vectorized_map or map_fn as the implementation is already - # vectorized + # Skip the use of vectorized_map or map_fn as the implementation is + # already vectorized return self._augment(inputs) def _posterize(self, image): diff --git a/keras_cv/layers/preprocessing/posterization_test.py b/keras_cv/layers/preprocessing/posterization_test.py index eab153f188..76140f9054 100644 --- a/keras_cv/layers/preprocessing/posterization_test.py +++ b/keras_cv/layers/preprocessing/posterization_test.py @@ -88,8 +88,8 @@ def _calc_expected_output(image, bits): """Posterization in numpy, based on Albumentations: The algorithm is basically: - 1. create a lookup table of all possible input pixel values to pixel values - after posterize + 1. create a lookup table of all possible input pixel values to pixel + values after posterize 2. map each pixel in the input to created lookup table. Source: diff --git a/keras_cv/layers/preprocessing/rand_augment.py b/keras_cv/layers/preprocessing/rand_augment.py index 3a81794adb..45009e5806 100644 --- a/keras_cv/layers/preprocessing/rand_augment.py +++ b/keras_cv/layers/preprocessing/rand_augment.py @@ -26,9 +26,9 @@ class RandAugment(RandomAugmentationPipeline): """RandAugment performs the Rand Augment operation on input images. - This layer can be thought of as an all in one image augmentation layer. The policy - implemented by this layer has been benchmarked extensively and is effective on a - wide variety of datasets. + This layer can be thought of as an all-in-one image augmentation layer. The + policy implemented by this layer has been benchmarked extensively and is + effective on a wide variety of datasets. The policy operates as follows: @@ -44,28 +44,30 @@ class RandAugment(RandomAugmentationPipeline): value_range: the range of values the incoming images will have. Represented as a two number tuple written [low, high]. This is typically either `[0, 1]` or `[0, 255]` depending - on how your preprocessing pipeline is setup. - augmentations_per_image: the number of layers to use in the rand augment policy. - Defaults to `3`. - magnitude: magnitude is the mean of the normal distribution used to sample the - magnitude used for each data augmentation. magnitude should - be a float in the range `[0, 1]`. A magnitude of `0` indicates that the - augmentations are as weak as possible (not recommended), while a value of - `1.0` implies use of the strongest possible augmentation. All magnitudes - are clipped to the range `[0, 1]` after sampling. Defaults to `0.5`. - magnitude_stddev: the standard deviation to use when drawing values - for the perturbations. Keep in mind magnitude will still be clipped to the - range `[0, 1]` after samples are drawn from the normal distribution. - Defaults to `0.15`. - rate: the rate at which to apply each augmentation. This parameter is applied - on a per-distortion layer, per image. Should be in the range `[0, 1]`. - To reproduce the original RandAugment paper results, set this to `10/11`. - The original `RandAugment` paper includes an Identity transform. By setting - the rate to 10/11 in our implementation, the behavior is identical to - sampling an Identity augmentation 10/11th of the time. - Defaults to `1.0`. - geometric: whether or not to include geometric augmentations. 
This should be - set to False when performing object detection. Defaults to True. + on how your preprocessing pipeline is set up. + augmentations_per_image: the number of layers to use in the rand augment + policy, defaults to `3`. + magnitude: magnitude is the mean of the normal distribution used to + sample the magnitude used for each data augmentation. Magnitude + should be a float in the range `[0, 1]`. A magnitude of `0` + indicates that the augmentations are as weak as possible (not + recommended), while a value of `1.0` implies use of the strongest + possible augmentation. All magnitudes are clipped to the range + `[0, 1]` after sampling. Defaults to `0.5`. + magnitude_stddev: the standard deviation to use when drawing values for + the perturbations. Keep in mind magnitude will still be clipped to + the range `[0, 1]` after samples are drawn from the normal + distribution. Defaults to `0.15`. + rate: the rate at which to apply each augmentation. This parameter is + applied on a per-distortion layer, per image. Should be in the range + `[0, 1]`. To reproduce the original RandAugment paper results, set + this to `10/11`. The original `RandAugment` paper includes an + Identity transform. By setting the rate to 10/11 in our + implementation, the behavior is identical to sampling an Identity + augmentation 10/11th of the time. Defaults to `1.0`. + geometric: whether to include geometric augmentations. This + should be set to False when performing object detection. Defaults to + True. Usage: ```python (x_test, y_test), _ = keras.datasets.cifar10.load_data() @@ -87,11 +89,12 @@ def __init__( seed=None, **kwargs, ): - # As an optimization RandAugment makes all internal layers use (0, 255) while + # As an optimization RandAugment makes all internal layers use (0, 255) # and we handle range transformation at the _augment level. if magnitude < 0.0 or magnitude > 1: raise ValueError( - f"`magnitude` must be in the range [0, 1], got `magnitude={magnitude}`" + "`magnitude` must be in the range [0, 1], got " + f"`magnitude={magnitude}`" ) if magnitude_stddev < 0.0 or magnitude_stddev > 1: raise ValueError( @@ -209,7 +212,8 @@ def equalize_policy(magnitude, magnitude_stddev): def solarize_policy(magnitude, magnitude_stddev): # We cap additions at 110, because if we add more than 110 we will be nearly - # nullifying the information contained in the image, making the model train on noise + # nullifying the information contained in the image, making the model train + # on noise maximum_addition_value = 110 addition_factor = core.NormalFactorSampler( mean=magnitude * maximum_addition_value, diff --git a/keras_cv/layers/preprocessing/random_aspect_ratio.py b/keras_cv/layers/preprocessing/random_aspect_ratio.py index 3660e32f09..bf889cc85f 100644 --- a/keras_cv/layers/preprocessing/random_aspect_ratio.py +++ b/keras_cv/layers/preprocessing/random_aspect_ratio.py @@ -24,14 +24,15 @@ @keras.utils.register_keras_serializable(package="keras_cv") class RandomAspectRatio(BaseImageAugmentationLayer): - """RandomAspectRatio randomly distorts the aspect ratio of the provided image. + """RandomAspectRatio randomly distorts the aspect ratio of the provided + image. - This is done on an element-wise basis, and as a consequence this layer always - returns a tf.RaggedTensor. + This is done on an element-wise basis, and as a consequence this layer + always returns a tf.RaggedTensor. 
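Because each image can come out with a different shape, downstream code should expect ragged output even for a dense batched input. A hypothetical usage sketch, assuming the layer is exported as `keras_cv.layers.RandomAspectRatio` and accepts a two-float `factor` tuple:

```python
import tensorflow as tf
import keras_cv

# Hypothetical usage; the export path and factor tuple are assumptions.
images = tf.random.uniform((4, 224, 224, 3), maxval=255.0)
layer = keras_cv.layers.RandomAspectRatio(factor=(0.9, 1.1))
outputs = layer(images)
print(isinstance(outputs, tf.RaggedTensor))  # expected: True
```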
Args: - factor: a range of values in the range `(0, infinity)` that determines the - percentage to distort the aspect ratio of each image by. + factor: a range of values in the range `(0, infinity)` that determines + the percentage to distort the aspect ratio of each image by. interpolation: interpolation method used in the `Resize` op. Supported values are `"nearest"` and `"bilinear"`. Defaults to `"bilinear"`. diff --git a/keras_cv/layers/preprocessing/random_augmentation_pipeline.py b/keras_cv/layers/preprocessing/random_augmentation_pipeline.py index 3eaaf0f12b..e5db1b9e4a 100644 --- a/keras_cv/layers/preprocessing/random_augmentation_pipeline.py +++ b/keras_cv/layers/preprocessing/random_augmentation_pipeline.py @@ -23,12 +23,14 @@ @keras.utils.register_keras_serializable(package="keras_cv") class RandomAugmentationPipeline(BaseImageAugmentationLayer): - """RandomAugmentationPipeline constructs a pipeline based on provided arguments. + """RandomAugmentationPipeline constructs a pipeline based on provided + arguments. - The implemented policy does the following: for each inputs provided in `call`(), the - policy first inputs a random number, if the number is < rate, the policy then - selects a random layer from the provided list of `layers`. It then calls the - `layer()` on the inputs. This is done `augmentations_per_image` times. + The implemented policy does the following: for each input provided in + `call`(), the policy first draws a random number, if the number is < rate, + the policy then selects a random layer from the provided list of `layers`. + It then calls the `layer()` on the inputs. This is done + `augmentations_per_image` times. This layer can be used to create custom policies resembling `RandAugment` or `AutoAugment`. @@ -39,7 +41,8 @@ class RandomAugmentationPipeline(BaseImageAugmentationLayer): layers = keras_cv.layers.RandAugment.get_standard_policy( value_range=(0, 255), magnitude=0.75, magnitude_stddev=0.3 ) - layers = layers[:4] # slice out some layers you don't want for whatever reason + layers = layers[:4] # slice out some layers you don't want for whatever + # reason layers = layers + [keras_cv.layers.GridMask()] # create the pipeline. @@ -51,19 +54,20 @@ class RandomAugmentationPipeline(BaseImageAugmentationLayer): ``` Args: - layers: a list of `keras.Layers`. These are randomly inputs during - augmentation to augment the inputs passed in `call()`. The layers passed - should subclass `BaseImageAugmentationLayer`. Passing `layers=[]` - would result in a no-op. - augmentations_per_image: the number of layers to apply to each inputs in the - `call()` method. - rate: the rate at which to apply each augmentation. This is applied on a per - augmentation bases, so if `augmentations_per_image=3` and `rate=0.5`, the - odds an image will receive no augmentations is 0.5^3, or 0.5*0.5*0.5. + layers: a list of `keras.Layers`. These are randomly selected during + augmentation to augment the inputs passed in `call()`. The layers + passed should subclass `BaseImageAugmentationLayer`. Passing + `layers=[]` would result in a no-op. + augmentations_per_image: the number of layers to apply to each input in + the `call()` method. + rate: the rate at which to apply each augmentation. This is applied on a + per-augmentation basis, so if `augmentations_per_image=3` and + `rate=0.5`, the odds an image will receive no augmentations is + 0.5^3, or 0.5*0.5*0.5. auto_vectorize: whether to use `tf.vectorized_map` or `tf.map_fn` to - apply the augmentations.
This offers a significant performance boost, but - can only be used if all the layers provided to the `layers` argument - support auto vectorization. + apply the augmentations. This offers a significant performance + boost, but can only be used if all the layers provided to the + `layers` argument support auto vectorization. seed: Integer. Used to create a random seed. """ diff --git a/keras_cv/layers/preprocessing/random_brightness.py b/keras_cv/layers/preprocessing/random_brightness.py index 64852845f5..51a9e5fab8 100644 --- a/keras_cv/layers/preprocessing/random_brightness.py +++ b/keras_cv/layers/preprocessing/random_brightness.py @@ -15,7 +15,7 @@ import tensorflow as tf from tensorflow import keras -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing as preprocessing_utils @@ -28,8 +28,8 @@ class RandomBrightness(VectorizedBaseImageAugmentationLayer): This layer will randomly increase/reduce the brightness for the input RGB images. - Note that different brightness adjustment factors will be apply to each the - images in the batch. + Note that different brightness adjustment factors + will be applied to each of the images in the batch. Args: factor: Float or a list/tuple of 2 floats between -1.0 and 1.0. The @@ -40,9 +40,9 @@ is provided, eg, 0.2, then -0.2 will be used for lower bound and 0.2 will be used for upper bound. value_range: Optional list/tuple of 2 floats for the lower and upper limit - of the values of the input data. Defaults to [0.0, 255.0]. Can be + of the values of the input data, defaults to [0.0, 255.0]. Can be changed to e.g. [0.0, 1.0] if the image input has been scaled before - this layer. The brightness adjustment will be scaled to this range, and + this layer. The brightness adjustment will be scaled to this range, and the output values will be clipped to this range. seed: optional integer, for fixed RNG behavior. Inputs: 3D (HWC) or 4D (NHWC) tensor, with float or int dtype. Input pixel diff --git a/keras_cv/layers/preprocessing/random_channel_shift.py b/keras_cv/layers/preprocessing/random_channel_shift.py index 43a4e4431f..220b970351 100644 --- a/keras_cv/layers/preprocessing/random_channel_shift.py +++ b/keras_cv/layers/preprocessing/random_channel_shift.py @@ -39,20 +39,21 @@ class RandomChannelShift(BaseImageAugmentationLayer): value_range: The range of values the incoming images will have. Represented as a two number tuple written [low, high]. This is typically either `[0, 1]` or `[0, 255]` depending - on how your preprocessing pipeline is setup. + on how your preprocessing pipeline is set up. factor: A scalar value, or tuple/list of two floating values in the range `[0.0, 1.0]`. If `factor` is a single value, it will interpret as equivalent to the tuple `(0.0, factor)`. The `factor` - will sampled between its range for every image to augment. - channels: integer, the number of channels to shift. Defaults to 3 which - corresponds to an RGB shift. In some cases, there may ber more or less - channels. + will be sampled from this range for every image to augment. + channels: integer, the number of channels to shift, defaults to 3 which + corresponds to an RGB shift. In some cases, there may be more or + fewer channels. seed: Integer. Used to create a random seed.
Usage: ```python (images, labels), _ = keras.datasets.cifar10.load_data() - rgb_shift = keras_cv.layers.RandomChannelShift(value_range=(0, 255), factor=0.5) + rgb_shift = keras_cv.layers.RandomChannelShift(value_range=(0, 255), + factor=0.5) augmented_images = rgb_shift(images) ``` """ diff --git a/keras_cv/layers/preprocessing/random_choice.py b/keras_cv/layers/preprocessing/random_choice.py index 654735e657..c34bb26c36 100644 --- a/keras_cv/layers/preprocessing/random_choice.py +++ b/keras_cv/layers/preprocessing/random_choice.py @@ -24,9 +24,9 @@ class RandomChoice(BaseImageAugmentationLayer): """RandomChoice constructs a pipeline based on provided arguments. - The implemented policy does the following: for each inputs provided in `call`(), the - policy selects a random layer from the provided list of `layers`. It then calls the - `layer()` on the inputs. + The implemented policy does the following: for each input provided in + `call`(), the policy selects a random layer from the provided list of + `layers`. It then calls the `layer()` on the inputs. Usage: ```python @@ -34,7 +34,8 @@ class RandomChoice(BaseImageAugmentationLayer): layers = keras_cv.layers.RandAugment.get_standard_policy( value_range=(0, 255), magnitude=0.75, magnitude_stddev=0.3 ) - layers = layers[:4] # slice out some layers you don't want for whatever reason + layers = layers[:4] # slice out some layers you don't want for whatever + # reason layers = layers + [keras_cv.layers.GridMask()] # create the pipeline. @@ -44,13 +45,13 @@ class RandomChoice(BaseImageAugmentationLayer): ``` Args: - layers: a list of `keras.Layers`. These are randomly inputs during - augmentation to augment the inputs passed in `call()`. The layers passed - should subclass `BaseImageAugmentationLayer`. + layers: a list of `keras.Layers`. These are randomly selected during + augmentation to augment the inputs passed in `call()`. The layers + passed should subclass `BaseImageAugmentationLayer`. auto_vectorize: whether to use `tf.vectorized_map` or `tf.map_fn` to - apply the augmentations. This offers a significant performance boost, but - can only be used if all the layers provided to the `layers` argument - support auto vectorization. + apply the augmentations. This offers a significant performance + boost, but can only be used if all the layers provided to the + `layers` argument support auto vectorization. seed: Integer. Used to create a random seed. """ diff --git a/keras_cv/layers/preprocessing/random_color_degeneration.py b/keras_cv/layers/preprocessing/random_color_degeneration.py index e4b51b1e1c..f8ba575ec0 100644 --- a/keras_cv/layers/preprocessing/random_color_degeneration.py +++ b/keras_cv/layers/preprocessing/random_color_degeneration.py @@ -25,22 +25,22 @@ class RandomColorDegeneration(BaseImageAugmentationLayer): """Randomly performs the color degeneration operation on given images. - The sharpness operation first converts an image to gray scale, then back to color. - It then takes a weighted average between original image and the degenerated image. - This makes colors appear more dull. + The sharpness operation first converts an image to gray scale, then back to + color. It then takes a weighted average between original image and the + degenerated image. This makes colors appear more dull. Args: factor: A tuple of two floats, a single float or a `keras_cv.FactorSampler`. `factor` controls the extent to which the - image sharpness is impacted.
`factor=0.0` makes this layer perform a no-op - operation, while a value of 1.0 uses the degenerated result entirely. - Values between 0 and 1 result in linear interpolation between the original - image and the sharpened image. - Values should be between `0.0` and `1.0`. If a tuple is used, a `factor` is - sampled between the two values for every image augmented. If a single float - is used, a value between `0.0` and the passed float is sampled. In order to - ensure the value is always the same, please pass a tuple with two identical - floats: `(0.5, 0.5)`. + image sharpness is impacted. `factor=0.0` makes this layer perform a + no-op operation, while a value of 1.0 uses the degenerated result + entirely. Values between 0 and 1 result in linear interpolation + between the original image and the sharpened image. + Values should be between `0.0` and `1.0`. If a tuple is used, a + `factor` is sampled between the two values for every image + augmented. If a single float is used, a value between `0.0` and the + passed float is sampled. In order to ensure the value is always the + same, please pass a tuple with two identical floats: `(0.5, 0.5)`. seed: Integer. Used to create a random seed. """ diff --git a/keras_cv/layers/preprocessing/random_color_degeneration_test.py b/keras_cv/layers/preprocessing/random_color_degeneration_test.py index 16fa93e77a..03708d66ed 100644 --- a/keras_cv/layers/preprocessing/random_color_degeneration_test.py +++ b/keras_cv/layers/preprocessing/random_color_degeneration_test.py @@ -58,7 +58,8 @@ def test_color_degeneration_70p_factor(self): # The formula for luma is result= 0.2989*r + 0.5870*g + 0.1140*b luma_result = 0.2989 + 2 * 0.5870 + 3 * 0.1140 - # with factor=0.7, luma_result should be blended at a 70% rate with the original + # with factor=0.7, luma_result should be blended at a 70% rate with the + # original r_result = luma_result * 0.7 + 1 * 0.3 g_result = luma_result * 0.7 + 2 * 0.3 b_result = luma_result * 0.7 + 3 * 0.3 diff --git a/keras_cv/layers/preprocessing/random_color_jitter.py b/keras_cv/layers/preprocessing/random_color_jitter.py index db1328307b..e0dcbc5710 100644 --- a/keras_cv/layers/preprocessing/random_color_jitter.py +++ b/keras_cv/layers/preprocessing/random_color_jitter.py @@ -15,7 +15,7 @@ from tensorflow import keras from keras_cv.layers import preprocessing -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing as preprocessing_utils @@ -36,10 +36,10 @@ class RandomColorJitter(VectorizedBaseImageAugmentationLayer): `(..., height, width, channels)`, in `channels_last` format Args: - value_range: the range of values the incoming images will have. + value_range: the range of values the incoming images will have. Represented as a two number tuple written [low, high]. This is typically either `[0, 1]` or `[0, 255]` depending - on how your preprocessing pipeline is setup. + on how your preprocessing pipeline is set up. brightness_factor: Float or a list/tuple of 2 floats between -1.0 and 1.0. The factor is used to determine the lower bound and upper bound of the brightness adjustment. A float value will be @@ -61,11 +61,11 @@ class RandomColorJitter(VectorizedBaseImageAugmentationLayer): `keras_cv.FactorSampler`. `factor` controls the extent to which the image sharpness is impacted. 
`factor=0.0` makes this layer perform a no-op operation, while a value of 1.0 performs the most aggressive - contrast adjustment available. If a tuple is used, a `factor` is sampled - between the two values for every image augmented. If a single float - is used, a value between `0.0` and the passed float is sampled. - In order to ensure the value is always the same, please pass a tuple - with two identical floats: `(0.5, 0.5)`. + contrast adjustment available. If a tuple is used, a `factor` is + sampled between the two values for every image augmented. If a + single float is used, a value between `0.0` and the passed float is + sampled. In order to ensure the value is always the same, please + pass a tuple with two identical floats: `(0.5, 0.5)`. seed: Integer. Used to create a random seed. Usage: diff --git a/keras_cv/layers/preprocessing/random_contrast.py b/keras_cv/layers/preprocessing/random_contrast.py index 564aebf51c..80903dbb74 100644 --- a/keras_cv/layers/preprocessing/random_contrast.py +++ b/keras_cv/layers/preprocessing/random_contrast.py @@ -15,7 +15,7 @@ import tensorflow as tf from tensorflow import keras -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing as preprocessing_utils diff --git a/keras_cv/layers/preprocessing/random_crop.py b/keras_cv/layers/preprocessing/random_crop.py index 7fd8b9c4b8..3aa3d92988 100644 --- a/keras_cv/layers/preprocessing/random_crop.py +++ b/keras_cv/layers/preprocessing/random_crop.py @@ -17,7 +17,7 @@ from tensorflow import keras from keras_cv import bounding_box -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) diff --git a/keras_cv/layers/preprocessing/random_crop_and_resize.py b/keras_cv/layers/preprocessing/random_crop_and_resize.py index 5af4cc7b77..ab59c29e51 100644 --- a/keras_cv/layers/preprocessing/random_crop_and_resize.py +++ b/keras_cv/layers/preprocessing/random_crop_and_resize.py @@ -27,36 +27,37 @@ class RandomCropAndResize(BaseImageAugmentationLayer): """Randomly crops a part of an image and resizes it to provided size. - This implementation takes an intuitive approach, where we crop the images to a - random height and width, and then resize them. To do this, we first sample a - random value for area using `crop_area_factor` and a value for aspect ratio using - `aspect_ratio_factor`. Further we get the new height and width by - dividing and multiplying the old height and width by the random area + This implementation takes an intuitive approach, where we crop the images to + a random height and width, and then resize them. To do this, we first sample + a random value for area using `crop_area_factor` and a value for aspect + ratio using `aspect_ratio_factor`. Further we get the new height and width + by dividing and multiplying the old height and width by the random area respectively. We then sample offsets for height and width and clip them such - that the cropped area does not exceed image boundaries. Finally we do the + that the cropped area does not exceed image boundaries. Finally, we do the actual cropping operation and resize the image to `target_size`. 
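To make the sampling steps above concrete, one plausible way to derive the crop window from the two factors is sketched below. The helper name and the exact arithmetic (square-root split of area and aspect ratio, uniform offsets) are illustrative assumptions, not the layer's implementation.

```python
import tensorflow as tf


def crop_window_sketch(image, target_size, crop_area_factor=(0.08, 1.0),
                       aspect_ratio_factor=(3 / 4, 4 / 3)):
    """Illustrative crop-window derivation for a single HWC image."""
    area = tf.random.uniform([], *crop_area_factor)
    ratio = tf.random.uniform([], *aspect_ratio_factor)
    # Relative crop height/width so that height * width == area and
    # width / height == ratio, clipped to stay inside the image.
    crop_h = tf.minimum(tf.sqrt(area / ratio), 1.0)
    crop_w = tf.minimum(tf.sqrt(area * ratio), 1.0)
    # Sample offsets so the crop does not exceed the image boundaries.
    y0 = tf.random.uniform([], 0.0, 1.0 - crop_h)
    x0 = tf.random.uniform([], 0.0, 1.0 - crop_w)
    boxes = tf.reshape(tf.stack([y0, x0, y0 + crop_h, x0 + crop_w]), (1, 4))
    return tf.image.crop_and_resize(
        image[tf.newaxis, ...], boxes, box_indices=[0], crop_size=target_size
    )[0]
```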
Args: - target_size: A tuple of two integers used as the target size to ultimately crop - images to. + target_size: A tuple of two integers used as the target size to + ultimately crop images to. crop_area_factor: A tuple of two floats, ConstantFactorSampler or - UniformFactorSampler. The ratio of area of the cropped part to - that of original image is sampled using this factor. Represents the - lower and upper bounds for the area relative to the original image - of the cropped image before resizing it to `target_size`. For + UniformFactorSampler. The ratio of area of the cropped part to that + of original image is sampled using this factor. Represents the lower + and upper bounds for the area relative to the original image of the + cropped image before resizing it to `target_size`. For self-supervised pretraining a common value for this parameter is - `(0.08, 1.0)`. For fine tuning and classification a common value for this - is `0.8, 1.0`. + `(0.08, 1.0)`. For fine tuning and classification a common value for + this is `0.8, 1.0`. aspect_ratio_factor: A tuple of two floats, ConstantFactorSampler or UniformFactorSampler. Aspect ratio means the ratio of width to - height of the cropped image. In the context of this layer, the aspect ratio - sampled represents a value to distort the aspect ratio by. - Represents the lower and upper bound for the aspect ratio of the - cropped image before resizing it to `target_size`. For most tasks, this - should be `(3/4, 4/3)`. To perform a no-op provide the value `(1.0, 1.0)`. + height of the cropped image. In the context of this layer, the + aspect ratio sampled represents a value to distort the aspect ratio + by. Represents the lower and upper bound for the aspect ratio of the + cropped image before resizing it to `target_size`. For most tasks, + this should be `(3/4, 4/3)`. To perform a no-op provide the value + `(1.0, 1.0)`. interpolation: (Optional) A string specifying the sampling method for - resizing. Defaults to "bilinear". - seed: (Optional) Used to create a random seed. Defaults to None. + resizing, defaults to "bilinear". + seed: (Optional) Used to create a random seed, defaults to None. """ def __init__( @@ -224,9 +225,9 @@ def _check_class_arguments( or isinstance(crop_area_factor, int) ): raise ValueError( - "`crop_area_factor` must be tuple of two positive floats less than " - "or equal to 1 or keras_cv.core.FactorSampler instance. Received " - f"crop_area_factor={crop_area_factor}" + "`crop_area_factor` must be tuple of two positive floats less " + "than or equal to 1 or keras_cv.core.FactorSampler instance. " + f"Received crop_area_factor={crop_area_factor}" ) if ( diff --git a/keras_cv/layers/preprocessing/random_crop_and_resize_test.py b/keras_cv/layers/preprocessing/random_crop_and_resize_test.py index ca32a8ca1e..69d9a14295 100644 --- a/keras_cv/layers/preprocessing/random_crop_and_resize_test.py +++ b/keras_cv/layers/preprocessing/random_crop_and_resize_test.py @@ -68,7 +68,8 @@ def test_grayscale(self): def test_target_size_errors(self, target_size): with self.assertRaisesRegex( ValueError, - "`target_size` must be tuple of two integers. Received target_size=(.*)", + "`target_size` must be tuple of two integers. 
" + "Received target_size=(.*)", ): _ = preprocessing.RandomCropAndResize( target_size=target_size, @@ -85,7 +86,8 @@ def test_aspect_ratio_factor_errors(self, aspect_ratio_factor): with self.assertRaisesRegex( ValueError, "`aspect_ratio_factor` must be tuple of two positive floats or " - "keras_cv.core.FactorSampler instance. Received aspect_ratio_factor=(.*)", + "keras_cv.core.FactorSampler instance. " + "Received aspect_ratio_factor=(.*)", ): _ = preprocessing.RandomCropAndResize( target_size=(224, 224), @@ -101,9 +103,9 @@ def test_aspect_ratio_factor_errors(self, aspect_ratio_factor): def test_crop_area_factor_errors(self, crop_area_factor): with self.assertRaisesRegex( ValueError, - "`crop_area_factor` must be tuple of two positive floats less than or " - "equal to 1 or keras_cv.core.FactorSampler instance. Received " - "crop_area_factor=(.*)", + "`crop_area_factor` must be tuple of two positive floats less than " + "or equal to 1 or keras_cv.core.FactorSampler instance. " + "Received crop_area_factor=(.*)", ): _ = preprocessing.RandomCropAndResize( target_size=(224, 224), diff --git a/keras_cv/layers/preprocessing/random_cutout.py b/keras_cv/layers/preprocessing/random_cutout.py index 6826bfbb88..478a49cd99 100644 --- a/keras_cv/layers/preprocessing/random_cutout.py +++ b/keras_cv/layers/preprocessing/random_cutout.py @@ -28,25 +28,25 @@ class RandomCutout(BaseImageAugmentationLayer): Args: height_factor: A tuple of two floats, a single float or a - `keras_cv.FactorSampler`. `height_factor` controls the size of the - cutouts. `height_factor=0.0` means the rectangle will be of size 0% of the - image height, `height_factor=0.1` means the rectangle will have a size of - 10% of the image height, and so forth. - Values should be between `0.0` and `1.0`. If a tuple is used, a - `height_factor` is sampled between the two values for every image augmented. - If a single float is used, a value between `0.0` and the passed float is - sampled. In order to ensure the value is always the same, please pass a - tuple with two identical floats: `(0.5, 0.5)`. + `keras_cv.FactorSampler`. `height_factor` controls the size of the + cutouts. `height_factor=0.0` means the rectangle will be of size 0% + of the image height, `height_factor=0.1` means the rectangle will + have a size of 10% of the image height, and so forth. Values should + be between `0.0` and `1.0`. If a tuple is used, a `height_factor` + is sampled between the two values for every image augmented. If a + single float is used, a value between `0.0` and the passed float is + sampled. In order to ensure the value is always the same, please + pass a tuple with two identical floats: `(0.5, 0.5)`. width_factor: A tuple of two floats, a single float or a - `keras_cv.FactorSampler`. `width_factor` controls the size of the - cutouts. `width_factor=0.0` means the rectangle will be of size 0% of the - image height, `width_factor=0.1` means the rectangle will have a size of 10% - of the image width, and so forth. - Values should be between `0.0` and `1.0`. If a tuple is used, a - `width_factor` is sampled between the two values for every image augmented. - If a single float is used, a value between `0.0` and the passed float is - sampled. In order to ensure the value is always the same, please pass a - tuple with two identical floats: `(0.5, 0.5)`. + `keras_cv.FactorSampler`. `width_factor` controls the size of the + cutouts. 
`width_factor=0.0` means the rectangle will be of size 0% + of the image height, `width_factor=0.1` means the rectangle will + have a size of 10% of the image width, and so forth. + Values should be between `0.0` and `1.0`. If a tuple is used, a + `width_factor` is sampled between the two values for every image + augmented. If a single float is used, a value between `0.0` and the + passed float is sampled. In order to ensure the value is always the + same, please pass a tuple with two identical floats: `(0.5, 0.5)`. fill_mode: Pixels inside the patches are filled according to the given mode (one of `{"constant", "gaussian_noise"}`). - *constant*: Pixels are filled with the same constant value. @@ -87,7 +87,7 @@ def __init__( if fill_mode not in ["gaussian_noise", "constant"]: raise ValueError( '`fill_mode` should be "gaussian_noise" ' - f'or "constant". Got `fill_mode`={fill_mode}' + f'or "constant". Got `fill_mode`={fill_mode}' ) def _parse_bounds(self, factor): diff --git a/keras_cv/layers/preprocessing/random_flip.py b/keras_cv/layers/preprocessing/random_flip.py index eb79504e6d..cf639d6aac 100644 --- a/keras_cv/layers/preprocessing/random_flip.py +++ b/keras_cv/layers/preprocessing/random_flip.py @@ -48,11 +48,12 @@ class RandomFlip(BaseImageAugmentationLayer): Arguments: mode: String indicating which flip mode to use. Can be `"horizontal"`, - `"vertical"`, or `"horizontal_and_vertical"`. Defaults to + `"vertical"`, or `"horizontal_and_vertical"`, defaults to `"horizontal"`. `"horizontal"` is a left-right flip and `"vertical"` is a top-bottom flip. seed: Integer. Used to create a random seed. - bounding_box_format: The format of bounding boxes of input dataset. Refer to + bounding_box_format: The format of bounding boxes of input dataset. + Refer to https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box/converters.py for more details on supported bounding box formats. """ diff --git a/keras_cv/layers/preprocessing/random_gaussian_blur.py b/keras_cv/layers/preprocessing/random_gaussian_blur.py index e26deae198..3f6b0e837e 100644 --- a/keras_cv/layers/preprocessing/random_gaussian_blur.py +++ b/keras_cv/layers/preprocessing/random_gaussian_blur.py @@ -26,16 +26,17 @@ class RandomGaussianBlur(BaseImageAugmentationLayer): """Applies a Gaussian Blur with random strength to an image. Args: - kernel_size: int, 2 element tuple or 2 element list. x and y dimensions for - the kernel used. If tuple or list, first element is used for the x dimension - and second element is used for y dimension. If int, kernel will be squared. + kernel_size: int, 2 element tuple or 2 element list. x and y dimensions + for the kernel used. If tuple or list, first element is used for the + x dimension and second element is used for y dimension. If int, + kernel will be squared. factor: A tuple of two floats, a single float or a `keras_cv.FactorSampler`. `factor` controls the extent to which the - image is blurred. Mathematically, `factor` represents the `sigma` value in - a gaussian blur. `factor=0.0` makes this layer perform a no-op - operation, and high values make the blur stronger. In order to - ensure the value is always the same, please pass a tuple with two identical - floats: `(0.5, 0.5)`. + image is blurred. Mathematically, `factor` represents the `sigma` + value in a gaussian blur. `factor=0.0` makes this layer perform a + no-op operation, and high values make the blur stronger. 
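Since `factor` is documented as the Gaussian `sigma`, a plain NumPy sketch of a normalized 1-D kernel shows why larger factors blur more. The offsets mirror the `tf.range(...)` used by `get_kernel()` further below; the normalization choice here is only illustrative:

```python
import numpy as np

def gaussian_kernel_1d(sigma, filter_size=5):
    # Offsets around the kernel center, mirroring the float32 range used
    # by RandomGaussianBlur.get_kernel().
    x = np.arange(-filter_size // 2 + 1, filter_size // 2 + 1, dtype=np.float32)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    # Normalize so the blur preserves overall brightness.
    return kernel / kernel.sum()

print(gaussian_kernel_1d(sigma=0.5))  # weight concentrated at the center
print(gaussian_kernel_1d(sigma=2.0))  # weight spread out, stronger blur
```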
In order to + ensure the value is always the same, please pass a tuple with two + identical floats: `(0.5, 0.5)`. """ def __init__(self, kernel_size, factor, **kwargs): @@ -100,8 +101,9 @@ def augment_segmentation_mask( @staticmethod def get_kernel(factor, filter_size): - # We are running this in float32, regardless of layer's self.compute_dtype. - # Calculating blur_filter in lower precision will corrupt the final results. + # We are running this in float32, regardless of layer's + # self.compute_dtype. Calculating blur_filter in lower precision will + # corrupt the final results. x = tf.cast( tf.range(-filter_size // 2 + 1, filter_size // 2 + 1), dtype=tf.float32, diff --git a/keras_cv/layers/preprocessing/random_hue.py b/keras_cv/layers/preprocessing/random_hue.py index 2c93c4588e..8128a35c2a 100644 --- a/keras_cv/layers/preprocessing/random_hue.py +++ b/keras_cv/layers/preprocessing/random_hue.py @@ -15,7 +15,7 @@ import tensorflow as tf from tensorflow import keras -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing as preprocessing_utils @@ -32,18 +32,19 @@ class RandomHue(VectorizedBaseImageAugmentationLayer): hue channel (H) by delta. The image is then converted back to RGB. Args: - factor: A tuple of two floats, a single float or `keras_cv.FactorSampler`. - `factor` controls the extent to which the image hue is impacted. - `factor=0.0` makes this layer perform a no-op operation, while a value of - 1.0 performs the most aggressive contrast adjustment available. If a tuple - is used, a `factor` is sampled between the two values for every image - augmented. If a single float is used, a value between `0.0` and the passed - float is sampled. In order to ensure the value is always the same, please + factor: A tuple of two floats, a single float or + `keras_cv.FactorSampler`. `factor` controls the extent to which the + image hue is impacted. `factor=0.0` makes this layer perform a + no-op operation, while a value of 1.0 performs the most aggressive + contrast adjustment available. If a tuple is used, a `factor` is + sampled between the two values for every image augmented. If a + single float is used, a value between `0.0` and the passed float is + sampled. In order to ensure the value is always the same, please pass a tuple with two identical floats: `(0.5, 0.5)`. - value_range: the range of values the incoming images will have. - Represented as a two number tuple written [low, high]. - This is typically either `[0, 1]` or `[0, 255]` depending - on how your preprocessing pipeline is setup. + value_range: the range of values the incoming images will have. + Represented as a two number tuple written [low, high]. This is + typically either `[0, 1]` or `[0, 255]` depending on how your + preprocessing pipeline is set up. seed: Integer. Used to create a random seed. Usage: @@ -69,9 +70,10 @@ def get_random_transformation_batch(self, batch_size, **kwargs): invert = tf.where( invert > 0.5, -tf.ones_like(invert), tf.ones_like(invert) ) - # We must scale self.factor() to the range [-0.5, 0.5]. This is because the - # tf.image operation performs rotation on the hue saturation value orientation. - # This can be thought of as an angle in the range [-180, 180] + # We must scale self.factor() to the range [-0.5, 0.5]. 
This is because + # the tf.image operation performs rotation on the hue saturation value + # orientation. This can be thought of as an angle in the range + # [-180, 180] return invert * self.factor(shape=(batch_size,)) * 0.5 def augment_ragged_image(self, image, transformation, **kwargs): diff --git a/keras_cv/layers/preprocessing/random_hue_test.py b/keras_cv/layers/preprocessing/random_hue_test.py index 5171417c7a..f07de07d04 100644 --- a/keras_cv/layers/preprocessing/random_hue_test.py +++ b/keras_cv/layers/preprocessing/random_hue_test.py @@ -46,8 +46,8 @@ def test_adjust_full_opposite_hue(self): channel_max = tf.math.reduce_max(output, axis=-1) channel_min = tf.math.reduce_min(output, axis=-1) - # Make sure the max and min channel are the same between input and output - # In the meantime, and channel will swap between each other. + # Make sure the max and min channel are the same between input and + # output. In the meantime, and channel will swap between each other. self.assertAllClose( channel_max, tf.math.reduce_max(image, axis=-1), diff --git a/keras_cv/layers/preprocessing/random_jpeg_quality.py b/keras_cv/layers/preprocessing/random_jpeg_quality.py index bc0efa8110..0d44b115d6 100644 --- a/keras_cv/layers/preprocessing/random_jpeg_quality.py +++ b/keras_cv/layers/preprocessing/random_jpeg_quality.py @@ -25,12 +25,13 @@ class RandomJpegQuality(BaseImageAugmentationLayer): """Applies Random Jpeg compression artifacts to an image. - Performs the jpeg compression algorithm on the image. This layer can used in order - to ensure your model is robust to artifacts introduced by JPEG compresion. + Performs the jpeg compression algorithm on the image. This layer can be used + in order to ensure your model is robust to artifacts introduced by JPEG + compression. Args: - factor: 2 element tuple or 2 element list. During augmentation, a random number - is drawn from the factor distribution. This value is passed to + factor: 2 element tuple or 2 element list. During augmentation, a random + number is drawn from the factor distribution. This value is passed to `tf.image.adjust_jpeg_quality()`. seed: Integer. Used to create a random seed. diff --git a/keras_cv/layers/preprocessing/random_rotation.py b/keras_cv/layers/preprocessing/random_rotation.py index dc0e3cadf2..af200af2dc 100644 --- a/keras_cv/layers/preprocessing/random_rotation.py +++ b/keras_cv/layers/preprocessing/random_rotation.py @@ -17,13 +17,13 @@ from tensorflow import keras from keras_cv import bounding_box -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing as preprocessing_utils # In order to support both unbatched and batched inputs, the horizontal -# and verticle axis is reverse indexed +# and vertical axis is reverse indexed H_AXIS = -3 W_AXIS = -2 @@ -36,7 +36,7 @@ class RandomRotation(VectorizedBaseImageAugmentationLayer): according to `fill_mode`. Input pixel values can be of any range (e.g. `[0., 1.)` or `[0, 255]`) and - of interger or floating point dtype. By default, the layer will output + of integer or floating point dtype. By default, the layer will output floats. 
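The scaling described in the `RandomHue` comment above can be sketched directly; `factor` and `invert` here are NumPy stand-ins for the tensors the layer samples:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
batch_size = 4

# Factor values sampled in [0, 1], as configured on the layer.
factor = rng.uniform(0.0, 1.0, size=batch_size)

# Random sign per image, mirroring the `invert` tensor above.
invert = np.where(rng.uniform(size=batch_size) > 0.5, -1.0, 1.0)

# Scale to [-0.5, 0.5]; tf.image treats this as a hue rotation, which can
# be thought of as an angle in roughly [-180, 180] degrees.
hue_delta = invert * factor * 0.5
print(hue_delta)
```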
Input shape: @@ -212,7 +212,7 @@ def augment_bounding_boxes( bounding_box_format="xyxy", images=raw_images, ) - # cordinates cannot be float values, it is casted to int32 + # coordinates cannot be float values, it is cast to int32 bounding_boxes = bounding_box.convert_format( bounding_boxes, source="xyxy", diff --git a/keras_cv/layers/preprocessing/random_rotation_test.py b/keras_cv/layers/preprocessing/random_rotation_test.py index 85c9020a59..fc23fc71bd 100644 --- a/keras_cv/layers/preprocessing/random_rotation_test.py +++ b/keras_cv/layers/preprocessing/random_rotation_test.py @@ -157,7 +157,8 @@ def test_augment_sparse_segmentation_mask(self): masks = np.random.randint(2, size=(2, 20, 20, 1)) * (num_classes - 1) inputs = {"images": input_images, "segmentation_masks": masks} - # Attempting to rotate a sparse mask without specifying num_classes fails. + # Attempting to rotate a sparse mask without specifying num_classes + # fails. bad_layer = RandomRotation(factor=(0.25, 0.25)) with self.assertRaisesRegex(ValueError, "masks must be one-hot"): outputs = bad_layer(inputs) @@ -170,7 +171,7 @@ def test_augment_sparse_segmentation_mask(self): expected_masks = np.rot90(masks, axes=(1, 2)) self.assertAllClose(expected_masks, outputs["segmentation_masks"]) - # 45 degree rotation. Only verifies that no interpolation takes place. + # 45-degree rotation. Only verifies that no interpolation takes place. layer = RandomRotation( factor=(0.125, 0.125), segmentation_classes=num_classes ) diff --git a/keras_cv/layers/preprocessing/random_saturation.py b/keras_cv/layers/preprocessing/random_saturation.py index 135e4c7f72..9cb40bd629 100644 --- a/keras_cv/layers/preprocessing/random_saturation.py +++ b/keras_cv/layers/preprocessing/random_saturation.py @@ -15,7 +15,7 @@ import tensorflow as tf from tensorflow import keras -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing as preprocessing_utils @@ -29,16 +29,16 @@ class RandomSaturation(VectorizedBaseImageAugmentationLayer): images. Args: - factor: A tuple of two floats, a single float or `keras_cv.FactorSampler`. - `factor` controls the extent to which the image saturation is impacted. - `factor=0.5` makes this layer perform a no-op operation. `factor=0.0` makes - the image to be fully grayscale. `factor=1.0` makes the image to be fully - saturated. - Values should be between `0.0` and `1.0`. If a tuple is used, a `factor` - is sampled between the two values for every image augmented. If a single - float is used, a value between `0.0` and the passed float is sampled. - In order to ensure the value is always the same, please pass a tuple with - two identical floats: `(0.5, 0.5)`. + factor: A tuple of two floats, a single float or + `keras_cv.FactorSampler`. `factor` controls the extent to which the + image saturation is impacted. `factor=0.5` makes this layer perform + a no-op operation. `factor=0.0` makes the image to be fully + grayscale. `factor=1.0` makes the image to be fully saturated. + Values should be between `0.0` and `1.0`. If a tuple is used, a + `factor` is sampled between the two values for every image + augmented. If a single float is used, a value between `0.0` and the + passed float is sampled. In order to ensure the value is always the + same, please pass a tuple with two identical floats: `(0.5, 0.5)`. seed: Integer. 
Used to create a random seed. Usage: @@ -68,9 +68,9 @@ def augment_ragged_image(self, image, transformation, **kwargs): def augment_images(self, images, transformations, **kwargs): # Convert the factor range from [0, 1] to [0, +inf]. Note that the - # tf.image.adjust_saturation is trying to apply the following math formula - # `output_saturation = input_saturation * factor`. We use the following - # method to the do the mapping. + # tf.image.adjust_saturation is trying to apply the following math + # formula `output_saturation = input_saturation * factor`. We use the + # following method to the do the mapping. # `y = x / (1 - x)`. # This will ensure: # y = +inf when x = 1 (full saturation) @@ -78,8 +78,8 @@ def augment_images(self, images, transformations, **kwargs): # y = 0 when x = 0 (full gray scale) # Convert the transformation to tensor in case it is a float. When - # transformation is 1.0, then it will result in to divide by zero error, but - # it will be handled correctly when it is a one tensor. + # transformation is 1.0, then it will result in to divide by zero error, + # but it will be handled correctly when it is a one tensor. transformations = tf.convert_to_tensor(transformations) adjust_factors = transformations / (1 - transformations) adjust_factors = tf.cast(adjust_factors, dtype=images.dtype) diff --git a/keras_cv/layers/preprocessing/random_saturation_test.py b/keras_cv/layers/preprocessing/random_saturation_test.py index 253a0d9f5b..dde5643a85 100644 --- a/keras_cv/layers/preprocessing/random_saturation_test.py +++ b/keras_cv/layers/preprocessing/random_saturation_test.py @@ -31,16 +31,16 @@ class OldRandomSaturation(BaseImageAugmentationLayer): Call the layer with `training=True` to adjust the saturation of the input. Args: - factor: A tuple of two floats, a single float or `keras_cv.FactorSampler`. - `factor` controls the extent to which the image saturation is impacted. - `factor=0.5` makes this layer perform a no-op operation. `factor=0.0` makes - the image to be fully grayscale. `factor=1.0` makes the image to be fully - saturated. - Values should be between `0.0` and `1.0`. If a tuple is used, a `factor` - is sampled between the two values for every image augmented. If a single - float is used, a value between `0.0` and the passed float is sampled. - In order to ensure the value is always the same, please pass a tuple with - two identical floats: `(0.5, 0.5)`. + factor: A tuple of two floats, a single float or + `keras_cv.FactorSampler`. `factor` controls the extent to which the + image saturation is impacted. `factor=0.5` makes this layer perform + a no-op operation. `factor=0.0` makes the image to be fully + grayscale. `factor=1.0` makes the image to be fully saturated. + Values should be between `0.0` and `1.0`. If a tuple is used, a + `factor` is sampled between the two values for every image + augmented. If a single float is used, a value between `0.0` and the + passed float is sampled. In order to ensure the value is always the + same, please pass a tuple with two identical floats: `(0.5, 0.5)`. seed: Integer. Used to create a random seed. """ @@ -58,9 +58,9 @@ def get_random_transformation(self, **kwargs): def augment_image(self, image, transformation=None, **kwargs): # Convert the factor range from [0, 1] to [0, +inf]. Note that the - # tf.image.adjust_saturation is trying to apply the following math formula - # `output_saturation = input_saturation * factor`. We use the following - # method to the do the mapping. 
+ # tf.image.adjust_saturation is trying to apply the following math + # formula `output_saturation = input_saturation * factor`. We use the + # following method to the do the mapping. # `y = x / (1 - x)`. # This will ensure: # y = +inf when x = 1 (full saturation) @@ -68,8 +68,8 @@ def augment_image(self, image, transformation=None, **kwargs): # y = 0 when x = 0 (full gray scale) # Convert the transformation to tensor in case it is a float. When - # transformation is 1.0, then it will result in to divide by zero error, but - # it will be handled correctly when it is a one tensor. + # transformation is 1.0, then it will result in to divide by zero error, + # but it will be handled correctly when it is a one tensor. transformation = tf.convert_to_tensor(transformation) adjust_factor = transformation / (1 - transformation) return tf.image.adjust_saturation( @@ -135,8 +135,8 @@ def test_adjust_to_grayscale(self): channel_mean = tf.math.reduce_mean(output, axis=-1) channel_values = tf.unstack(output, axis=-1) - # Make sure all the pixel has the same value among the channel dim, which is - # a fully gray RGB. + # Make sure all the pixel has the same value among the channel dim, + # which is a fully gray RGB. for channel_value in channel_values: self.assertAllClose( channel_mean, channel_value, atol=1e-5, rtol=1e-5 diff --git a/keras_cv/layers/preprocessing/random_sharpness.py b/keras_cv/layers/preprocessing/random_sharpness.py index 60be8b6dee..26a65b44db 100644 --- a/keras_cv/layers/preprocessing/random_sharpness.py +++ b/keras_cv/layers/preprocessing/random_sharpness.py @@ -15,7 +15,7 @@ import tensorflow as tf from tensorflow import keras -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing @@ -25,29 +25,30 @@ class RandomSharpness(VectorizedBaseImageAugmentationLayer): """Randomly performs the sharpness operation on given images. - The sharpness operation first performs a blur operation, then blends between the - original image and the blurred image. This operation makes the edges of an image - less sharp than they were in the original image. + The sharpness operation first performs a blur operation, then blends between + the original image and the blurred image. This operation makes the edges of + an image less sharp than they were in the original image. References: - [PIL](https://pillow.readthedocs.io/en/stable/reference/ImageEnhance.html) Args: - factor: A tuple of two floats, a single float or `keras_cv.FactorSampler`. - `factor` controls the extent to which the image sharpness is impacted. - `factor=0.0` makes this layer perform a no-op operation, while a value of - 1.0 uses the sharpened result entirely. Values between 0 and 1 result in - linear interpolation between the original image and the sharpened image. - Values should be between `0.0` and `1.0`. If a tuple is used, a `factor` is - sampled between the two values for every image augmented. If a single float - is used, a value between `0.0` and the passed float is sampled. In order to - ensure the value is always the same, please pass a tuple with two identical - floats: `(0.5, 0.5)`. + factor: A tuple of two floats, a single float or + `keras_cv.FactorSampler`. `factor` controls the extent to which the + image sharpness is impacted. 
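The `RandomSaturation` mapping discussed in the comments above, `y = x / (1 - x)`, can be checked with a tiny NumPy sketch (the function name is only for illustration):

```python
import numpy as np

def saturation_adjust_factor(x):
    """Map a factor in [0, 1] to the multiplier expected by
    tf.image.adjust_saturation, as described in the comments above."""
    x = np.asarray(x, dtype=np.float32)
    # x = 1.0 yields inf (full saturation); the layer handles this by
    # keeping the value as a tensor.
    with np.errstate(divide="ignore"):
        return x / (1.0 - x)

print(saturation_adjust_factor(0.0))  # 0.0 -> fully grayscale
print(saturation_adjust_factor(0.5))  # 1.0 -> no-op
print(saturation_adjust_factor(1.0))  # inf -> fully saturated
```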
`factor=0.0` makes this layer perform a + no-op operation, while a value of 1.0 uses the sharpened result + entirely. Values between 0 and 1 result in linear interpolation + between the original image and the sharpened image. Values should be + between `0.0` and `1.0`. If a tuple is used, a `factor` is sampled + between the two values for every image augmented. If a single float + is used, a value between `0.0` and the passed float is sampled. In + order to ensure the value is always the same, please pass a tuple + with two identical floats: `(0.5, 0.5)`. value_range: the range of values the incoming images will have. Represented as a two number tuple written [low, high]. This is typically either `[0, 1]` or `[0, 255]` depending - on how your preprocessing pipeline is setup. - """ + on how your preprocessing pipeline is set up. + """ # noqa: E501 def __init__( self, diff --git a/keras_cv/layers/preprocessing/random_shear.py b/keras_cv/layers/preprocessing/random_shear.py index af8ac439bf..3adf17beaa 100644 --- a/keras_cv/layers/preprocessing/random_shear.py +++ b/keras_cv/layers/preprocessing/random_shear.py @@ -19,7 +19,7 @@ import keras_cv from keras_cv import bounding_box -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing @@ -43,33 +43,32 @@ class RandomShear(VectorizedBaseImageAugmentationLayer): Args: x_factor: A tuple of two floats, a single float or a - `keras_cv.FactorSampler`. For each augmented image a value is sampled - from the provided range. If a float is passed, the range is interpreted as - `(0, x_factor)`. Values represent a percentage of the image to shear over. - For example, 0.3 shears pixels up to 30% of the way across the image. - All provided values should be positive. If `None` is passed, no shear - occurs on the X axis. - Defaults to `None`. + `keras_cv.FactorSampler`. For each augmented image a value is + sampled from the provided range. If a float is passed, the range is + interpreted as `(0, x_factor)`. Values represent a percentage of the + image to shear over. For example, 0.3 shears pixels up to 30% of the + way across the image. All provided values should be positive. If + `None` is passed, no shear occurs on the X axis. Defaults to `None`. y_factor: A tuple of two floats, a single float or a - `keras_cv.FactorSampler`. For each augmented image a value is sampled - from the provided range. If a float is passed, the range is interpreted as - `(0, y_factor)`. Values represent a percentage of the image to shear over. - For example, 0.3 shears pixels up to 30% of the way across the image. - All provided values should be positive. If `None` is passed, no shear - occurs on the Y axis. - Defaults to `None`. - interpolation: interpolation method used in the `ImageProjectiveTransformV3` op. - Supported values are `"nearest"` and `"bilinear"`. - Defaults to `"bilinear"`. - fill_mode: fill_mode in the `ImageProjectiveTransformV3` op. - Supported values are `"reflect"`, `"wrap"`, `"constant"`, and `"nearest"`. - Defaults to `"reflect"`. - fill_value: fill_value in the `ImageProjectiveTransformV3` op. - A `Tensor` of type `float32`. The value to be filled when fill_mode is - constant". Defaults to `0.0`. - bounding_box_format: The format of bounding boxes of input dataset. 
Refer to - https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box/converters.py - for more details on supported bounding box formats. + `keras_cv.FactorSampler`. For each augmented image a value is + sampled from the provided range. If a float is passed, the range is + interpreted as `(0, y_factor)`. Values represent a percentage of the + image to shear over. For example, 0.3 shears pixels up to 30% of the + way across the image. All provided values should be positive. If + `None` is passed, no shear occurs on the Y axis. Defaults to `None`. + interpolation: interpolation method used in the + `ImageProjectiveTransformV3` op. Supported values are `"nearest"` + and `"bilinear"`, defaults to `"bilinear"`. + fill_mode: fill_mode in the `ImageProjectiveTransformV3` op. Supported + values are `"reflect"`, `"wrap"`, `"constant"`, and `"nearest"`. + Defaults to `"reflect"`. + fill_value: fill_value in the `ImageProjectiveTransformV3` op. A + `Tensor` of type `float32`. The value to be filled when fill_mode is + constant". Defaults to `0.0`. + bounding_box_format: The format of bounding boxes of input dataset. + Refer to + https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box/converters.py + for more details on supported bounding box formats. seed: Integer. Used to create a random seed. """ diff --git a/keras_cv/layers/preprocessing/random_shear_test.py b/keras_cv/layers/preprocessing/random_shear_test.py index b6e4ba28cf..648b4cd67a 100644 --- a/keras_cv/layers/preprocessing/random_shear_test.py +++ b/keras_cv/layers/preprocessing/random_shear_test.py @@ -161,7 +161,8 @@ def augment(x, y): self.assertNotAllClose(xs, 2.0) def test_no_augmentation(self): - """test for no image and bbox augmenation when x_factor,y_factor is 0,0""" + """test for no image and bbox augmentation when x_factor,y_factor is + 0,0""" xs = tf.cast( tf.stack( [2 * tf.ones((4, 4, 3)), tf.ones((4, 4, 3))], diff --git a/keras_cv/layers/preprocessing/random_translation.py b/keras_cv/layers/preprocessing/random_translation.py index f4b1e4c402..a186aba6a7 100644 --- a/keras_cv/layers/preprocessing/random_translation.py +++ b/keras_cv/layers/preprocessing/random_translation.py @@ -16,7 +16,7 @@ from tensorflow import keras import keras_cv -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing as preprocessing_utils @@ -43,7 +43,7 @@ class RandomTranslation(VectorizedBaseImageAugmentationLayer): shifting image down. When represented as a single positive float, this value is used for both the upper and lower bound. For instance, `height_factor=(-0.2, 0.3)` results in an output shifted by a random - amount in the range `[-20%, +30%]`. `height_factor=0.2` results in an + amount in the range `[-20%, +30%]`. `height_factor=0.2` results in an output height shifted by a random amount in the range `[-20%, +20%]`. width_factor: a float represented as fraction of value, or a tuple of size 2 representing lower and upper bound for shifting horizontally. A @@ -70,10 +70,11 @@ class RandomTranslation(VectorizedBaseImageAugmentationLayer): seed: Integer. Used to create a random seed. fill_value: a float represents the value to be filled outside the boundaries when `fill_mode="constant"`. - bounding_box_format: The format of bounding boxes of input dataset. 
Refer to + bounding_box_format: The format of bounding boxes of input dataset. + Refer to https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box/converters.py - for more details on supported bounding box formats. This is required when - augmenting data which includes bounding boxes. + for more details on supported bounding box formats. This is required + when augmenting data which includes bounding boxes. Input shape: 3D (unbatched) or 4D (batched) tensor with shape: diff --git a/keras_cv/layers/preprocessing/random_zoom.py b/keras_cv/layers/preprocessing/random_zoom.py index 6bbb175e61..f3dfced101 100644 --- a/keras_cv/layers/preprocessing/random_zoom.py +++ b/keras_cv/layers/preprocessing/random_zoom.py @@ -17,13 +17,13 @@ from keras import backend from tensorflow import keras -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing as preprocessing_utils # In order to support both unbatched and batched inputs, the horizontal -# and verticle axis is reverse indexed +# and vertical axis is reverse indexed H_AXIS = -3 W_AXIS = -2 diff --git a/keras_cv/layers/preprocessing/randomly_zoomed_crop.py b/keras_cv/layers/preprocessing/randomly_zoomed_crop.py index 3e736d0770..b00924135c 100644 --- a/keras_cv/layers/preprocessing/randomly_zoomed_crop.py +++ b/keras_cv/layers/preprocessing/randomly_zoomed_crop.py @@ -17,7 +17,7 @@ from tensorflow import keras from keras_cv import core -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing as preprocessing_utils @@ -34,7 +34,7 @@ class RandomlyZoomedCrop(VectorizedBaseImageAugmentationLayer): amount of distortion in the image is proportional to the `zoom_factor` argument. To do this, we first sample a random value for `zoom_factor` and `aspect_ratio_factor`. Further we deduce a `crop_size` which abides by the - calculated aspect ratio. Finally we do the actual cropping operation and + calculated aspect ratio. Finally, we do the actual cropping operation and resize the image to `(height, width)`. Args: @@ -49,12 +49,12 @@ class RandomlyZoomedCrop(VectorizedBaseImageAugmentationLayer): aspect ratio sampled represents a value to distort the aspect ratio by. Represents the lower and upper bound for the aspect ratio of the - cropped image before resizing it to `(height, width)`. For most - tasks, this should be `(3/4, 4/3)`. To perform a no-op provide the + cropped image before resizing it to `(height, width)`. For most + tasks, this should be `(3/4, 4/3)`. To perform a no-op provide the value `(1.0, 1.0)`. interpolation: (Optional) A string specifying the sampling method for - resizing. Defaults to "bilinear". - seed: (Optional) Used to create a random seed. Defaults to None. + resizing, defaults to "bilinear". + seed: (Optional) Used to create a random seed, defaults to None. 
""" def __init__( diff --git a/keras_cv/layers/preprocessing/repeated_augmentation.py b/keras_cv/layers/preprocessing/repeated_augmentation.py index 6e979d116b..325b4b1cd5 100644 --- a/keras_cv/layers/preprocessing/repeated_augmentation.py +++ b/keras_cv/layers/preprocessing/repeated_augmentation.py @@ -24,18 +24,19 @@ class RepeatedAugmentation(BaseImageAugmentationLayer): """RepeatedAugmentation augments each image in a batch multiple times. - This technique exists to emulate the behavior of stochastic gradient descent within - the context of mini-batch gradient descent. When training large vision models, - choosing a large batch size can introduce too much noise into aggregated gradients - causing the overall batch's gradients to be less effective than gradients produced - using smaller gradients. RepeatedAugmentation handles this by re-using the same - image multiple times within a batch creating correlated samples. + This technique exists to emulate the behavior of stochastic gradient descent + within the context of mini-batch gradient descent. When training large + vision models, choosing a large batch size can introduce too much noise into + aggregated gradients causing the overall batch's gradients to be less + effective than gradients produced using smaller gradients. + RepeatedAugmentation handles this by re-using the same image multiple times + within a batch creating correlated samples. This layer increases your batch size by a factor of `len(augmenters)`. Args: augmenters: the augmenters to use to augment the image - shuffle: whether or not to shuffle the result. Essential when using an + shuffle: whether to shuffle the result. Essential when using an asynchronous distribution strategy such as ParameterServerStrategy. Usage: @@ -69,10 +70,10 @@ class RepeatedAugmentation(BaseImageAugmentationLayer): ``` References: - - [DEIT implementaton](https://github.com/facebookresearch/deit/blob/ee8893c8063f6937fec7096e47ba324c206e22b9/samplers.py#L8) + - [DEIT implementation](https://github.com/facebookresearch/deit/blob/ee8893c8063f6937fec7096e47ba324c206e22b9/samplers.py#L8) - [Original publication](https://openaccess.thecvf.com/content_CVPR_2020/papers/Hoffer_Augment_Your_Batch_Improving_Generalization_Through_Instance_Repetition_CVPR_2020_paper.pdf) - """ + """ # noqa: E501 def __init__(self, augmenters, shuffle=True, **kwargs): super().__init__(**kwargs) @@ -82,7 +83,8 @@ def __init__(self, augmenters, shuffle=True, **kwargs): def _batch_augment(self, inputs): if "bounding_boxes" in inputs: raise ValueError( - "RepeatedAugmentation() does not yet support bounding box labels." + "RepeatedAugmentation() does not yet support bounding box " + "labels." ) augmenter_outputs = [augmenter(inputs) for augmenter in self.augmenters] @@ -108,7 +110,7 @@ def shuffle_outputs(self, result): def _augment(self, inputs): raise ValueError( - "RepeatedAugmentation() only works in batched mode. If " + "RepeatedAugmentation() only works in batched mode. If " "you would like to create batches from a single image, use " "`x = tf.expand_dims(x, axis=0)` on your input images and labels." ) diff --git a/keras_cv/layers/preprocessing/resizing.py b/keras_cv/layers/preprocessing/resizing.py index 37b33726c4..84c893abf6 100644 --- a/keras_cv/layers/preprocessing/resizing.py +++ b/keras_cv/layers/preprocessing/resizing.py @@ -32,7 +32,7 @@ class Resizing(BaseImageAugmentationLayer): This layer resizes an image input to a target height and width. 
The input should be a 4D (batched) or 3D (unbatched) tensor in `"channels_last"` - format. Input pixel values can be of any range (e.g. `[0., 1.)` or `[0, + format. Input pixel values can be of any range (e.g. `[0., 1.)` or `[0, 255]`) and of integer or floating point dtype. By default, the layer will output floats. @@ -46,22 +46,24 @@ class Resizing(BaseImageAugmentationLayer): Args: height: Integer, the height of the output shape. width: Integer, the width of the output shape. - interpolation: String, the interpolation method. Defaults to `"bilinear"`. - Supports `"bilinear"`, `"nearest"`, `"bicubic"`, `"area"`, `"lanczos3"`, - `"lanczos5"`, `"gaussian"`, `"mitchellcubic"`. - crop_to_aspect_ratio: If True, resize the images without aspect - ratio distortion. When the original aspect ratio differs from the target - aspect ratio, the output image will be cropped so as to return the - largest possible window in the image (of size `(height, width)`) that - matches the target aspect ratio. By default + interpolation: String, the interpolation method, defaults to + `"bilinear"`. Supports `"bilinear"`, `"nearest"`, `"bicubic"`, + `"area"`, `"lanczos3"`, `"lanczos5"`, `"gaussian"`, + `"mitchellcubic"`. + crop_to_aspect_ratio: If True, resize the images without aspect ratio + distortion. When the original aspect ratio differs from the target + aspect ratio, the output image will be cropped to return the largest + possible window in the image (of size `(height, width)`) that + matches the target aspect ratio. By default, (`crop_to_aspect_ratio=False`), aspect ratio may not be preserved. - pad_to_aspect_ratio: If True, resize the images without aspect - ratio distortion. When the original aspect ratio differs from the target - aspect ratio, the output image will be padded so as to return the - largest possible resize of the image (of size `(height, width)`) that - matches the target aspect ratio. By default + pad_to_aspect_ratio: If True, resize the images without aspect ratio + distortion. When the original aspect ratio differs from the target + aspect ratio, the output image will be padded to return the largest + possible resize of the image (of size `(height, width)`) that + matches the target aspect ratio. By default, (`pad_to_aspect_ratio=False`), aspect ratio may not be preserved. - bounding_box_format: The format of bounding boxes of input dataset. Refer to + bounding_box_format: The format of bounding boxes of input dataset. + Refer to https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box/converters.py for more details on supported bounding box formats. """ @@ -96,7 +98,7 @@ def __init__( if not pad_to_aspect_ratio and bounding_box_format: raise ValueError( "Resizing() only supports bounding boxes when in " - "`pad_to_aspect_ratio=True` mode. " + "`pad_to_aspect_ratio=True` mode. " "Please pass `pad_to_aspect_ratio=True`" "when processing bounding boxes with `Resizing()`" ) @@ -239,15 +241,15 @@ def _resize_with_crop(self, inputs): if bounding_boxes is not None: raise ValueError( "Resizing(crop_to_aspect_ratio=True) does not support " - "bounding box inputs. Please use `pad_to_aspect_ratio=True` when " - "processing bounding boxes with Resizing()." + "bounding box inputs. Please use `pad_to_aspect_ratio=True` " + "when processing bounding boxes with Resizing()." 
) inputs["images"] = images size = [self.height, self.width] # tf.image.resize will always output float32 and operate more # efficiently on float32 unless interpolation is nearest, in which case - # ouput type matches input type. + # output type matches input type. if self.interpolation == "nearest": input_dtype = self.compute_dtype else: @@ -288,8 +290,9 @@ def _batch_augment(self, inputs): and self.bounding_box_format is None ): raise ValueError( - "Resizing requires `bounding_box_format` to be set " - "when augmenting bounding boxes, but `self.bounding_box_format=None`." + "Resizing requires `bounding_box_format` to be set when " + "augmenting bounding boxes, but " + "`self.bounding_box_format=None`." ) if self.crop_to_aspect_ratio: diff --git a/keras_cv/layers/preprocessing/solarization.py b/keras_cv/layers/preprocessing/solarization.py index 33e97135a7..eac7202122 100644 --- a/keras_cv/layers/preprocessing/solarization.py +++ b/keras_cv/layers/preprocessing/solarization.py @@ -15,7 +15,7 @@ import tensorflow as tf from tensorflow import keras -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) from keras_cv.utils import preprocessing @@ -25,9 +25,9 @@ class Solarization(VectorizedBaseImageAugmentationLayer): """Applies (max_value - pixel + min_value) for each pixel in the image. - When created without `threshold` parameter, the layer performs solarization to - all values. When created with specified `threshold` the layer only augments - pixels that are above the `threshold` value + When created without `threshold` parameter, the layer performs solarization + to all values. When created with specified `threshold` the layer only + augments pixels that are above the `threshold` value Reference: - [AutoAugment: Learning Augmentation Policies from Data]( @@ -36,21 +36,22 @@ class Solarization(VectorizedBaseImageAugmentationLayer): - [RandAugment](https://arxiv.org/pdf/1909.13719.pdf) Args: - value_range: a tuple or a list of two elements. The first value represents - the lower bound for values in passed images, the second represents the - upper bound. Images passed to the layer should have values within - `value_range`. + value_range: a tuple or a list of two elements. The first value + represents the lower bound for values in passed images, the second + represents the upper bound. Images passed to the layer should have + values within `value_range`. addition_factor: (Optional) A tuple of two floats, a single float or a - `keras_cv.FactorSampler`. For each augmented image a value is sampled - from the provided range. If a float is passed, the range is interpreted as - `(0, addition_factor)`. If specified, this value is added to each pixel - before solarization and thresholding. The addition value should be scaled - according to the value range (0, 255). Defaults to 0.0. - threshold_factor: (Optional) A tuple of two floats, a single float or a - `keras_cv.FactorSampler`. For each augmented image a value is sampled - from the provided range. If a float is passed, the range is interpreted as - `(0, threshold_factor)`. If specified, only pixel values above this - threshold will be solarized. + `keras_cv.FactorSampler`. For each augmented image a value is + sampled from the provided range. If a float is passed, the range is + interpreted as `(0, addition_factor)`. 
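The `Solarization` transform described above, `(max_value - pixel + min_value)` with an optional threshold, can be sketched in NumPy. The `solarize` helper here is hypothetical, not the layer's code:

```python
import numpy as np

def solarize(image, value_range=(0, 255), threshold=None):
    """Sketch of the pixel transform described in the docstring above."""
    min_value, max_value = value_range
    inverted = max_value - image + min_value
    if threshold is None:
        return inverted  # solarize every pixel
    # In this sketch, only pixels at or above the threshold are inverted.
    return np.where(image >= threshold, inverted, image)

image = np.array([[0, 64, 128, 192, 255]], dtype=np.float32)
print(solarize(image))                 # full inversion
print(solarize(image, threshold=128))  # only bright pixels are inverted
```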
If specified, this value is + added to each pixel before solarization and thresholding. The + addition value should be scaled according to the value range + (0, 255), defaults to 0.0. + threshold_factor: (Optional) A tuple of two floats, a single float or + a `keras_cv.FactorSampler`. For each augmented image a value is + sampled from the provided range. If a float is passed, the range is + interpreted as `(0, threshold_factor)`. If specified, only pixel + values above this threshold will be solarized. seed: Integer. Used to create a random seed. Usage: diff --git a/keras_cv/layers/preprocessing/vectorized_base_image_augmentation_layer.py b/keras_cv/layers/preprocessing/vectorized_base_image_augmentation_layer.py index ec52d07d18..3e742f38bc 100644 --- a/keras_cv/layers/preprocessing/vectorized_base_image_augmentation_layer.py +++ b/keras_cv/layers/preprocessing/vectorized_base_image_augmentation_layer.py @@ -37,11 +37,11 @@ class VectorizedBaseImageAugmentationLayer( keras.__internal__.layers.BaseRandomLayer ): - """Abstract base layer for vectorized image augmentaion. + """Abstract base layer for vectorized image augmentation. This layer contains base functionalities for preprocessing layers which - augment image related data, eg. image and in future, label and bounding - boxes. The subclasses could avoid making certain mistakes and reduce code + augment image related data, e.g. image and in the future, label and bounding + boxes. The subclasses could avoid making certain mistakes and reduce code duplications. This layer requires you to implement one method: `augment_images()`, which @@ -55,10 +55,10 @@ class VectorizedBaseImageAugmentationLayer( the layer supports that. `get_random_transformations()`, which should produce a batch of random - transformation settings. The tranformation object, which must be a batched Tensor or - a dictionary where each input is a batched Tensor, will be - passed to `augment_images`, `augment_labels` and `augment_bounding_boxes`, to - coodinate the randomness behavior, eg, in the RandomFlip layer, the image + transformation settings. The transformation object, which must be a batched + Tensor or a dictionary where each input is a batched Tensor, will be passed + to `augment_images`, `augment_labels` and `augment_bounding_boxes`, to + coordinate the randomness behavior, eg, in the RandomFlip layer, the image and bounding_boxes should be changed in the same way. The `call()` method support two formats of inputs: @@ -75,7 +75,7 @@ class VectorizedBaseImageAugmentationLayer( Note that since the randomness is also a common functionality, this layer also includes a keras.backend.RandomGenerator, which can be used to - produce the random numbers. The random number generator is stored in the + produce the random numbers. The random number generator is stored in the `self._random_generator` attribute. """ @@ -85,34 +85,37 @@ def __init__(self, seed=None, **kwargs): def augment_ragged_image(self, image, transformation, **kwargs): """Augment an image from a ragged image batch during training. - This method accepts a single Dense image Tensor, and returns a Dense image. - The resulting images are then stacked back into a ragged image batch. The - behavior of this method should be identical to that of `augment_images()` but - is to operate on a batch-wise basis. + This method accepts a single Dense image Tensor, and returns a Dense + image. The resulting images are then stacked back into a ragged image + batch. 
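Based on the contract described in this docstring, a schematic subclass of `VectorizedBaseImageAugmentationLayer` might look as follows. The layer name and behavior are invented for illustration, and `tf.random.uniform` is used for brevity where the real layers would draw from `self._random_generator`:

```python
import tensorflow as tf
from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import (  # noqa: E501
    VectorizedBaseImageAugmentationLayer,
)


class RandomBrightnessShift(VectorizedBaseImageAugmentationLayer):
    """Hypothetical example layer: adds a random offset to every image."""

    def __init__(self, max_shift=0.1, **kwargs):
        super().__init__(**kwargs)
        self.max_shift = max_shift

    def get_random_transformation_batch(self, batch_size, **kwargs):
        # One scalar offset per image, returned as a batched tensor so it
        # can be shared across augment_images(), augment_labels(), etc.
        return tf.random.uniform(
            shape=(batch_size, 1, 1, 1),
            minval=-self.max_shift,
            maxval=self.max_shift,
        )

    def augment_images(self, images, transformations, **kwargs):
        # Broadcast the per-image offset over H, W and channels.
        return images + transformations

    def augment_labels(self, labels, transformations, **kwargs):
        # A brightness shift does not change the labels.
        return labels


# Sketch of how such a layer would be used (not executed here):
# layer = RandomBrightnessShift(max_shift=0.2)
# out = layer({"images": tf.random.uniform((4, 32, 32, 3))}, training=True)
```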
The behavior of this method should be identical to that of + `augment_images()` but is to operate on a batch-wise basis. Args: image: a single image from the batch transformation: a single transformation sampled from `get_random_transformations()`. - kwargs: all of the other call arguments (i.e. bounding_boxes, labels, etc.). + kwargs: all the other call arguments (i.e. bounding_boxes, labels, + etc.). Returns: Augmented image. """ raise NotImplementedError( "A ragged image batch was passed to layer of type " f"`{type(self).__name__}`. This layer does not implement " - "`augment_ragged_image()`. If this is a `keras_cv`, open a GitHub issue " - "requesting Ragged functionality on the layer titled: " - f"'`{type(self).__name__}`: ragged image support'. " - "If this is a custom layer, implement the `augment_ragged_image()` method." + "`augment_ragged_image()`. If this is a `keras_cv`, open a GitHub " + "issue requesting Ragged functionality on the layer titled: " + f"'`{type(self).__name__}`: ragged image support'. If this is a " + "custom layer, implement the `augment_ragged_image()` method." ) def compute_ragged_image_signature(self, images): - """Computes the output image signature for the `augment_image()` function. + """Computes the output image signature for the `augment_image()` + function. - Must be overridden to return tensors with different shapes than the input - images. By default returns either a `tf.RaggedTensorSpec` matching the input - image spec, or a `tf.TensorSpec` matching the input image spec. + Must be overridden to return tensors with different shapes than the + input images. By default, returns either a `tf.RaggedTensorSpec` + matching the input image spec, or a `tf.TensorSpec` matching the input + image spec. """ ragged_spec = tf.RaggedTensorSpec( shape=images.shape[1:], @@ -125,12 +128,13 @@ def augment_images(self, images, transformations, **kwargs): """Augment a batch of images during training. Args: - image: 4D image input tensor to the layer. Forwarded from - `layer.call()`. This should generally have the shape [B, H, W, C]. + images: 4D image input tensor to the layer. Forwarded from + `layer.call()`. This should generally have the shape [B, H, W, C]. Forwarded from `layer.call()`. transformations: The transformations object produced by `get_random_transformations`. Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. Returns: output 4D tensor, which will be forward to `layer.call()`. @@ -141,10 +145,11 @@ def augment_labels(self, labels, transformations, **kwargs): """Augment a batch of labels during training. Args: - label: 2D label to the layer. Forwarded from `layer.call()`. + labels: 2D label to the layer. Forwarded from `layer.call()`. transformations: The transformations object produced by `get_random_transformations`. Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. Returns: output 2D tensor, which will be forward to `layer.call()`. @@ -155,10 +160,11 @@ def augment_targets(self, targets, transformations, **kwargs): """Augment a batch of targets during training. Args: - target: 2D label to the layer. Forwarded from `layer.call()`. + targets: 2D label to the layer. Forwarded from `layer.call()`. transformations: The transformations object produced by `get_random_transformations`. 
Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. Returns: output 2D tensor, which will be forward to `layer.call()`. @@ -173,7 +179,8 @@ def augment_bounding_boxes(self, bounding_boxes, transformations, **kwargs): `call()`. transformations: The transformations object produced by `get_random_transformations`. Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. Returns: output 3D tensor, which will be forward to `layer.call()`. @@ -185,11 +192,12 @@ def augment_keypoints(self, keypoints, transformations, **kwargs): Args: keypoints: 3D keypoints input tensor to the layer. Forwarded from - `layer.call()`. Shape should be [batch, num_keypoints, 2] in the specified - keypoint format. + `layer.call()`. Shape should be [batch, num_keypoints, 2] in the + specified keypoint format. transformations: The transformations object produced by `get_random_transformations`. Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. Returns: output 3D tensor, which will be forward to `layer.call()`. @@ -202,15 +210,17 @@ def augment_segmentation_masks( """Augment a batch of images' segmentation masks during training. Args: - segmentation_mask: 4D segmentation mask input tensor to the layer. + segmentation_masks: 4D segmentation mask input tensor to the layer. This should generally have the shape [B, H, W, 1], or in some cases [B, H, W, C] for multilabeled data. Forwarded from `layer.call()`. transformations: The transformations object produced by `get_random_transformations`. Used to coordinate the randomness - between image, label, bounding box, keypoints, and segmentation mask. + between image, label, bounding box, keypoints, and segmentation + mask. Returns: - output 4D tensor containing the augmented segmentation mask, which will be forward to `layer.call()`. + output 4D tensor containing the augmented segmentation mask, which + will be forward to `layer.call()`. """ raise NotImplementedError() @@ -230,10 +240,10 @@ def get_random_transformation_batch( Args: batch_size: the batch size of transformations configuration to sample. - image: 3D image tensor from inputs. - label: optional 1D label tensor from inputs. - bounding_box: optional 2D bounding boxes tensor from inputs. - segmentation_mask: optional 3D segmentation mask tensor from inputs. + images: 3D image tensor from inputs. + labels: optional 1D label tensor from inputs. + bounding_boxes: optional 2D bounding boxes tensor from inputs. + segmentation_masks: optional 3D segmentation mask tensor from inputs. Returns: Any type of object, which will be forwarded to `augment_images`, @@ -382,7 +392,8 @@ def _format_inputs(self, inputs): if not isinstance(inputs, dict): raise ValueError( - f"Expect the inputs to be image tensor or dict. Got inputs={inputs}" + "Expect the inputs to be image tensor or dict. 
Got " + f"inputs={inputs}" ) if BOUNDING_BOXES in inputs: @@ -441,12 +452,13 @@ def _ensure_inputs_are_compute_dtype(self, inputs): return inputs def _format_bounding_boxes(self, bounding_boxes): - # We can't catch the case where this is None, sometimes RaggedTensor drops this - # dimension + # We can't catch the case where this is None, sometimes RaggedTensor + # drops this dimension. if "classes" not in bounding_boxes: raise ValueError( - "Bounding boxes are missing class_id. If you would like to pad the " - "bounding boxes with class_id, use: " - "`bounding_boxes['classes'] = tf.ones_like(bounding_boxes['boxes'])`." + "Bounding boxes are missing class_id. If you would like to pad " + "the bounding boxes with class_id, use: " + "`bounding_boxes['classes'] = " + "tf.ones_like(bounding_boxes['boxes'])`." ) return bounding_boxes diff --git a/keras_cv/layers/preprocessing/vectorized_base_image_augmentation_layer_test.py b/keras_cv/layers/preprocessing/vectorized_base_image_augmentation_layer_test.py index 193c186b46..30e2758325 100644 --- a/keras_cv/layers/preprocessing/vectorized_base_image_augmentation_layer_test.py +++ b/keras_cv/layers/preprocessing/vectorized_base_image_augmentation_layer_test.py @@ -15,7 +15,7 @@ import tensorflow as tf from keras_cv import bounding_box -from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( +from keras_cv.layers.preprocessing.vectorized_base_image_augmentation_layer import ( # noqa: E501 VectorizedBaseImageAugmentationLayer, ) diff --git a/keras_cv/layers/preprocessing/with_mixed_precision_test.py b/keras_cv/layers/preprocessing/with_mixed_precision_test.py index fc8fe3d9e0..f6b21a71a2 100644 --- a/keras_cv/layers/preprocessing/with_mixed_precision_test.py +++ b/keras_cv/layers/preprocessing/with_mixed_precision_test.py @@ -160,9 +160,9 @@ def test_can_run_in_mixed_precision(self, layer_cls, init_args): if not tf.config.list_physical_devices("GPU"): if layer_cls in NO_CPU_FP16_KERNEL_LAYERS: self.skipTest( - "There is currently no float16 CPU kernel registered for operations" - " `tf.image.adjust_saturation`, and `tf.image.adjust_hue`. " - "Skipping." + "There is currently no float16 CPU kernel registered for " + "operations `tf.image.adjust_saturation`, and " + "`tf.image.adjust_hue`. Skipping." 
) keras.mixed_precision.set_global_policy("mixed_float16") diff --git a/keras_cv/layers/preprocessing_3d/__init__.py b/keras_cv/layers/preprocessing_3d/__init__.py index e18e543168..652c0e3a9d 100644 --- a/keras_cv/layers/preprocessing_3d/__init__.py +++ b/keras_cv/layers/preprocessing_3d/__init__.py @@ -18,7 +18,7 @@ from keras_cv.layers.preprocessing_3d.frustum_random_dropping_points import ( FrustumRandomDroppingPoints, ) -from keras_cv.layers.preprocessing_3d.frustum_random_point_feature_noise import ( +from keras_cv.layers.preprocessing_3d.frustum_random_point_feature_noise import ( # noqa: E501 FrustumRandomPointFeatureNoise, ) from keras_cv.layers.preprocessing_3d.global_random_dropping_points import ( diff --git a/keras_cv/layers/preprocessing_3d/base_augmentation_layer_3d.py b/keras_cv/layers/preprocessing_3d/base_augmentation_layer_3d.py index cc309d7da0..57364f2e68 100644 --- a/keras_cv/layers/preprocessing_3d/base_augmentation_layer_3d.py +++ b/keras_cv/layers/preprocessing_3d/base_augmentation_layer_3d.py @@ -28,26 +28,26 @@ @keras.utils.register_keras_serializable(package="keras_cv") class BaseAugmentationLayer3D(keras.__internal__.layers.BaseRandomLayer): - """Abstract base layer for data augmentaion for 3D preception. + """Abstract base layer for data augmentation for 3D perception. This layer contains base functionalities for preprocessing layers which - augment 3D preception related data, eg. point_clouds and in future, images. - The subclasses could avoid making certain mistakes and reduce code + augment 3D perception related data, e.g. point_clouds and in the future, + images. The subclasses could avoid making certain mistakes and reduce code duplications. This layer requires you to implement one method: `augment_point_clouds()`, - which augments one or a sequence of point clouds during the training. There are a few - additional methods that you can implement for added functionality on the - layer: + which augments one or a sequence of point clouds during the training. There + are a few additional methods that you can implement for added functionality + on the layer: `augment_bounding_boxes()`, which handles the bounding box augmentation, if the layer supports that. `get_random_transformation()`, which should produce a random transformation - setting. The tranformation object, which could be any type, will be passed - to `augment_point_clouds` and `augment_bounding_boxes`, to - coodinate the randomness behavior, eg, in the RotateZ layer, the point_clouds - and bounding_boxes should be changed in the same way. + setting. The transformation object, which could be any type, will be passed + to `augment_point_clouds` and `augment_bounding_boxes`, to coordinate the + randomness behavior, eg, in the RotateZ layer, the point_clouds and + bounding_boxes should be changed in the same way. The `call()` method support two formats of inputs: 1. A dict of tensors with stable keys. The supported keys are: @@ -61,10 +61,10 @@ class BaseAugmentationLayer3D(keras.__internal__.layers.BaseRandomLayer): unpack the inputs, forward to the correct function, and pack the output back to the same structure as the inputs. - By default the `call()` method leverages the `tf.vectorized_map()` function. - Auto-vectorization can be disabled by setting `self.auto_vectorize = False` - in your `__init__()` method. When disabled, `call()` instead relies - on `tf.map_fn()`. For example: + By default, the `call()` method leverages the `tf.vectorized_map()` + function. 
Auto-vectorization can be disabled by setting + `self.auto_vectorize = False` in your `__init__()` method. When disabled, + `call()` instead relies on `tf.map_fn()`. For example: ```python class SubclassLayer(keras_cv.BaseImageAugmentationLayer): @@ -90,9 +90,9 @@ def augment_pointclouds(self, point_clouds, transformation): return pointcloud, boxes ``` - Note that since the randomness is also a common functionnality, this layer + Note that since the randomness is also a common functionality, this layer also includes a keras.backend.RandomGenerator, which can be used to - produce the random numbers. The random number generator is stored in the + produce the random numbers. The random number generator is stored in the `self._random_generator` attribute. """ @@ -104,9 +104,9 @@ def __init__(self, seed=None, **kwargs): def auto_vectorize(self): """Control whether automatic vectorization occurs. - By default the `call()` method leverages the `tf.vectorized_map()` - function. Auto-vectorization can be disabled by setting - `self.auto_vectorize = False` in your `__init__()` method. When + By default, the `call()` method leverages the `tf.vectorized_map()` + function. Auto-vectorization can be disabled by setting + `self.auto_vectorize = False` in your `__init__()` method. When disabled, `call()` instead relies on `tf.map_fn()`. For example: ```python @@ -156,7 +156,7 @@ def get_random_transformation(self, point_clouds=None, bounding_boxes=None): Args: point_clouds: 3D point clouds tensor from inputs. - bounding_box: 3D bounding boxes tensor from inputs. + bounding_boxes: 3D bounding boxes tensor from inputs. Returns: Any type of object, which will be forwarded to `augment_point_clouds`, @@ -176,8 +176,10 @@ def call(self, inputs, training=True): return self._batch_augment(inputs) else: raise ValueError( - "Point clouds augmentation layers are expecting inputs point clouds and bounding boxes to " - "be rank 3D (Frame, Point, Feature) or 4D (Batch, Frame, Point, Feature) tensors. Got shape: {} and {}".format( + "Point clouds augmentation layers are expecting inputs " + "point clouds and bounding boxes to be rank 3D (Frame, " + "Point, Feature) or 4D (Batch, Frame, Point, Feature) " + "tensors. Got shape: {} and {}".format( point_clouds.shape, bounding_boxes.shape ) ) diff --git a/keras_cv/layers/preprocessing_3d/frustum_random_dropping_points.py b/keras_cv/layers/preprocessing_3d/frustum_random_dropping_points.py index 19393dd653..f737a29354 100644 --- a/keras_cv/layers/preprocessing_3d/frustum_random_dropping_points.py +++ b/keras_cv/layers/preprocessing_3d/frustum_random_dropping_points.py @@ -27,13 +27,16 @@ class FrustumRandomDroppingPoints( base_augmentation_layer_3d.BaseAugmentationLayer3D ): - """A preprocessing layer which randomly drops point within a randomly generated frustum during training. + """A preprocessing layer which randomly drops point within a randomly + generated frustum during training. - This layer will randomly select a point from the point cloud as the center of a frustum then generate a frustum based - on r_distance, theta_width, and phi_width. Points inside the selected frustum are randomly dropped (setting all features to zero) - based on drop_rate. - The point_clouds tensor shape must be specific and cannot be dynamic. - During inference time, the output will be identical to input. Call the layer with `training=True` to drop the input points. 
+ This layer will randomly select a point from the point cloud as the center + of a frustum then generate a frustum based on r_distance, theta_width, and + phi_width. Points inside the selected frustum are randomly dropped + (setting all features to zero) based on drop_rate. The point_clouds tensor + shape must be specific and cannot be dynamic. During inference time, the + output will be identical to input. Call the layer with `training=True` to + drop the input points. Input shape: point_clouds: 3D (multi frames) float32 Tensor with shape @@ -50,8 +53,10 @@ class FrustumRandomDroppingPoints( r_distance: A float scalar sets the starting distance of a frustum. theta_width: A float scalar sets the theta width of a frustum. phi_width: A float scalar sets the phi width of a frustum. - drop_rate: A float scalar sets the probability threshold for dropping the points. - exclude_classes: An optional int scalar or a list of ints. Points with the specified class(es) will not be dropped. + drop_rate: A float scalar sets the probability threshold for dropping the + points. + exclude_classes: An optional int scalar or a list of ints. Points with the + specified class(es) will not be dropped. """ @@ -104,7 +109,8 @@ def get_config(self): } def get_random_transformation(self, point_clouds, **kwargs): - # Randomly select a point from the first frame as the center of the frustum. + # Randomly select a point from the first frame as the center of the + # frustum. valid_points = point_clouds[0, :, POINTCLOUD_LABEL_INDEX] > 0 num_valid_points = tf.math.reduce_sum(tf.cast(valid_points, tf.int32)) randomly_select_point_index = tf.random.uniform( diff --git a/keras_cv/layers/preprocessing_3d/frustum_random_dropping_points_test.py b/keras_cv/layers/preprocessing_3d/frustum_random_dropping_points_test.py index d83c6e2236..7c12c1a8d9 100644 --- a/keras_cv/layers/preprocessing_3d/frustum_random_dropping_points_test.py +++ b/keras_cv/layers/preprocessing_3d/frustum_random_dropping_points_test.py @@ -44,7 +44,7 @@ def test_not_augment_drop_rate0_point_clouds_and_bounding_boxes(self): outputs = add_layer(inputs) self.assertAllClose(inputs, outputs) - def test_not_augment_drop_rate1_frustum_empty_point_clouds_and_bounding_boxes( + def test_not_augment_drop_rate1_frustum_empty_point_clouds_and_bounding_boxes( # noqa: E501 self, ): add_layer = FrustumRandomDroppingPoints( diff --git a/keras_cv/layers/preprocessing_3d/frustum_random_point_feature_noise.py b/keras_cv/layers/preprocessing_3d/frustum_random_point_feature_noise.py index 4f7170735e..7fc1711195 100644 --- a/keras_cv/layers/preprocessing_3d/frustum_random_point_feature_noise.py +++ b/keras_cv/layers/preprocessing_3d/frustum_random_point_feature_noise.py @@ -28,13 +28,17 @@ class FrustumRandomPointFeatureNoise( base_augmentation_layer_3d.BaseAugmentationLayer3D ): - """A preprocessing layer which randomly add noise to point features within a randomly generated frustum during training. - - This layer will randomly select a point from the point cloud as the center of a frustum then generate a frustum based - on r_distance, theta_width, and phi_width. Uniformly sampled features noise from [1-max_noise_level, 1+max_noise_level] will be multiplied - to points inside the selected frustum. Here, we perturbe point features other than (x, y, z, class). - The point_clouds tensor shape must be specific and cannot be dynamic. - During inference time, the output will be identical to input. Call the layer with `training=True` to add noise to the input points. 
+ """A preprocessing layer which randomly add noise to point features within a + randomly generated frustum during training. + + This layer will randomly select a point from the point cloud as the center + of a frustum then generate a frustum based on r_distance, theta_width, and + phi_width. Uniformly sampled features noise from [1-max_noise_level, + 1+max_noise_level] will be multiplied to points inside the selected frustum. + Here, we perturbe point features other than (x, y, z, class). The + point_clouds tensor shape must be specific and cannot be dynamic. During + inference time, the output will be identical to input. Call the layer with + `training=True` to add noise to the input points. Input shape: point_clouds: 3D (multi frames) float32 Tensor with shape @@ -53,8 +57,10 @@ class FrustumRandomPointFeatureNoise( r_distance: A float scalar sets the starting distance of a frustum. theta_width: A float scalar sets the theta width of a frustum. phi_width: A float scalar sets the phi width of a frustum. - max_noise_level: A float scalar sets the sampled feature noise range [1-max_noise_level, 1+max_noise_level]. - exclude_classes: An optional int scalar or a list of ints. Points with the specified class(es) will not be modified. + max_noise_level: A float scalar sets the sampled feature noise range + [1-max_noise_level, 1+max_noise_level]. + exclude_classes: An optional int scalar or a list of ints. Points with the + specified class(es) will not be modified. """ @@ -98,7 +104,8 @@ def get_config(self): } def get_random_transformation(self, point_clouds, **kwargs): - # Randomly select a point from the first frame as the center of the frustum. + # Randomly select a point from the first frame as the center of the + # frustum. valid_points = point_clouds[0, :, POINTCLOUD_LABEL_INDEX] > 0 num_valid_points = tf.math.reduce_sum(tf.cast(valid_points, tf.int32)) randomly_select_point_index = tf.random.uniform( @@ -149,8 +156,8 @@ def augment_point_clouds_bounding_boxes( ): point_noise = transformation["point_noise"] - # Do not add noise to points that are protected by setting the corresponding - # point_noise = 1.0. + # Do not add noise to points that are protected by setting the + # corresponding point_noise = 1.0. protected_points = tf.zeros_like(point_clouds[..., -1], dtype=tf.bool) for excluded_class in self._exclude_classes: protected_points |= point_clouds[..., -1] == excluded_class diff --git a/keras_cv/layers/preprocessing_3d/frustum_random_point_feature_noise_test.py b/keras_cv/layers/preprocessing_3d/frustum_random_point_feature_noise_test.py index d4943abacd..5aebe3df3f 100644 --- a/keras_cv/layers/preprocessing_3d/frustum_random_point_feature_noise_test.py +++ b/keras_cv/layers/preprocessing_3d/frustum_random_point_feature_noise_test.py @@ -17,7 +17,7 @@ from tensorflow import keras from keras_cv.layers.preprocessing_3d import base_augmentation_layer_3d -from keras_cv.layers.preprocessing_3d.frustum_random_point_feature_noise import ( +from keras_cv.layers.preprocessing_3d.frustum_random_point_feature_noise import ( # noqa: E501 FrustumRandomPointFeatureNoise, ) @@ -84,8 +84,9 @@ def test_augment_specific_point_clouds_and_bounding_boxes(self): ).astype("float32") self.assertAllClose(inputs[BOUNDING_BOXES], outputs[BOUNDING_BOXES]) # [-20, -20, 21, 1, 0, 2] is randomly selected as the frustum center. - # [0, 1, 2, 3, 4, 5] and [10, 1, 2, 3, 4, 2] are not changed due to less than r_distance. - # [100, 100, 2, 3, 4, 1] is not changed due to outside phi_width. 
+ # [0, 1, 2, 3, 4, 5] and [10, 1, 2, 3, 4, 2] are not changed due to less + # than r_distance. [100, 100, 2, 3, 4, 1] is not changed due to outside + # phi_width. self.assertAllClose(outputs[POINT_CLOUDS], augmented_point_clouds) def test_augment_only_one_valid_point_point_clouds_and_bounding_boxes(self): @@ -128,7 +129,8 @@ def test_augment_only_one_valid_point_point_clouds_and_bounding_boxes(self): ] ).astype("float32") self.assertAllClose(inputs[BOUNDING_BOXES], outputs[BOUNDING_BOXES]) - # [100, 100, 2, 3, 4, 1] is selected as the frustum center because it is the only valid point. + # [100, 100, 2, 3, 4, 1] is selected as the frustum center because it is + # the only valid point. self.assertAllClose(outputs[POINT_CLOUDS], augmented_point_clouds) def test_not_augment_max_noise_level0_point_clouds_and_bounding_boxes(self): @@ -141,7 +143,7 @@ def test_not_augment_max_noise_level0_point_clouds_and_bounding_boxes(self): outputs = add_layer(inputs) self.assertAllClose(inputs, outputs) - def test_not_augment_max_noise_level1_frustum_empty_point_clouds_and_bounding_boxes( + def test_not_augment_max_noise_level1_frustum_empty_point_clouds_and_bounding_boxes( # noqa: E501 self, ): add_layer = FrustumRandomPointFeatureNoise( diff --git a/keras_cv/layers/preprocessing_3d/global_random_dropping_points.py b/keras_cv/layers/preprocessing_3d/global_random_dropping_points.py index 370bff60b9..1a5382cc88 100644 --- a/keras_cv/layers/preprocessing_3d/global_random_dropping_points.py +++ b/keras_cv/layers/preprocessing_3d/global_random_dropping_points.py @@ -28,7 +28,8 @@ class GlobalRandomDroppingPoints( """A preprocessing layer which randomly drops point during training. This layer will randomly drop points based on keep_probability. - During inference time, the output will be identical to input. Call the layer with `training=True` to drop the input points. + During inference time, the output will be identical to input. Call the layer + with `training=True` to drop the input points. Input shape: point_clouds: 3D (multi frames) float32 Tensor with shape @@ -44,8 +45,10 @@ class GlobalRandomDroppingPoints( A dictionary of Tensors with the same shape as input Tensors. Arguments: - drop_rate: A float scalar sets the probability threshold for dropping the points. - exclude_classes: An optional int scalar or a list of ints. Points with the specified class(es) will not be dropped. + drop_rate: A float scalar sets the probability threshold for dropping the + points. + exclude_classes: An optional int scalar or a list of ints. Points with the + specified class(es) will not be dropped. """ @@ -85,8 +88,8 @@ def augment_point_clouds_bounding_boxes( ): point_mask = transformation["point_mask"] - # Do not add noise to points that are protected by setting the corresponding - # point_noise = 1.0. + # Do not add noise to points that are protected by setting the + # corresponding point_noise = 1.0. 
protected_points = tf.zeros_like(point_clouds[0, :, -1], dtype=tf.bool) for excluded_class in self._exclude_classes: protected_points |= point_clouds[0, :, -1] == excluded_class diff --git a/keras_cv/layers/preprocessing_3d/global_random_dropping_points_test.py b/keras_cv/layers/preprocessing_3d/global_random_dropping_points_test.py index 2d7c9e9efe..b46280beca 100644 --- a/keras_cv/layers/preprocessing_3d/global_random_dropping_points_test.py +++ b/keras_cv/layers/preprocessing_3d/global_random_dropping_points_test.py @@ -42,8 +42,8 @@ def test_specific_augment_point_clouds_and_bounding_boxes(self): inputs = {POINT_CLOUDS: point_clouds, BOUNDING_BOXES: bounding_boxes} outputs = add_layer(inputs) self.assertNotAllClose(inputs, outputs) - # The augmented point clouds in the first frame should be the same as the - # augmented point clouds in the second frame. + # The augmented point clouds in the first frame should be the same as + # the augmented point clouds in the second frame. self.assertAllClose(outputs[POINT_CLOUDS][0], outputs[POINT_CLOUDS][1]) def test_not_augment_point_clouds_and_bounding_boxes(self): diff --git a/keras_cv/layers/preprocessing_3d/global_random_flip.py b/keras_cv/layers/preprocessing_3d/global_random_flip.py index 53e3dbeded..caf42544be 100644 --- a/keras_cv/layers/preprocessing_3d/global_random_flip.py +++ b/keras_cv/layers/preprocessing_3d/global_random_flip.py @@ -25,11 +25,13 @@ @keras.utils.register_keras_serializable(package="keras_cv") class GlobalRandomFlip(base_augmentation_layer_3d.BaseAugmentationLayer3D): - """A preprocessing layer which flips point clouds and bounding boxes with respect to the specified axis during training. + """A preprocessing layer which flips point clouds and bounding boxes with + respect to the specified axis during training. This layer will flip the whole scene with respect to the specified axes. Note that this layer currently only supports flipping over the Y axis. - During inference time, the output will be identical to input. Call the layer with `training=True` to flip the input. + During inference time, the output will be identical to input. Call the layer + with `training=True` to flip the input. Input shape: point_clouds: 3D (multi frames) float32 Tensor with shape @@ -45,16 +47,17 @@ class GlobalRandomFlip(base_augmentation_layer_3d.BaseAugmentationLayer3D): A dictionary of Tensors with the same shape as input Tensors. Args: - flip_x: Whether or not to flip over the X axis. Defaults to False. - flip_y: Whether or not to flip over the Y axis. Defaults to True. - flip_z: Whether or not to flip over the Z axis. Defaults to False. + flip_x: whether to flip over the X axis, defaults to False. + flip_y: whether to flip over the Y axis, defaults to True. + flip_z: whether to flip over the Z axis, defaults to False. """ def __init__(self, flip_x=False, flip_y=True, flip_z=False, **kwargs): if flip_x or flip_z: raise ValueError( "GlobalRandomFlip currently only supports flipping over the Y " - f"axis. Received flip_x={flip_x}, flip_y={flip_y}, flip_z={flip_z}." + f"axis. Received flip_x={flip_x}, flip_y={flip_y}, " + f"flip_z={flip_z}." 
) if not (flip_x or flip_y or flip_z): diff --git a/keras_cv/layers/preprocessing_3d/global_random_rotation.py b/keras_cv/layers/preprocessing_3d/global_random_rotation.py index 2b40090964..8a834296c4 100644 --- a/keras_cv/layers/preprocessing_3d/global_random_rotation.py +++ b/keras_cv/layers/preprocessing_3d/global_random_rotation.py @@ -26,12 +26,14 @@ @keras.utils.register_keras_serializable(package="keras_cv") class GlobalRandomRotation(base_augmentation_layer_3d.BaseAugmentationLayer3D): - """A preprocessing layer which randomly rotates point clouds and bounding boxes along - X, Y and Z axes during training. + """A preprocessing layer which randomly rotates point clouds and bounding + boxes along X, Y and Z axes during training. - This layer will randomly rotate the whole scene along the X, Y and Z axes based on a randomly sampled - rotation angle between [-max_rotation_angle, max_rotation_angle] (in radians) following a uniform distribution. - During inference time, the output will be identical to input. Call the layer with `training=True` to rotate the input. + This layer will randomly rotate the whole scene along the X, Y and Z axes + based on a randomly sampled rotation angle between [-max_rotation_angle, + max_rotation_angle] (in radians) following a uniform distribution. During + inference time, the output will be identical to input. Call the layer with + `training=True` to rotate the input. Input shape: point_clouds: 3D (multi frames) float32 Tensor with shape @@ -47,9 +49,12 @@ class GlobalRandomRotation(base_augmentation_layer_3d.BaseAugmentationLayer3D): A dictionary of Tensors with the same shape as input Tensors. Arguments: - max_rotation_angle_x: A float scalar sets the maximum rotation angle (in radians) along X axis. - max_rotation_angle_y: A float scalar sets the maximum rotation angle (in radians) along Y axis. - max_rotation_angle_z: A float scalar sets the maximum rotation angle (in radians) along Z axis. + max_rotation_angle_x: A float scalar sets the maximum rotation angle (in + radians) along X axis. + max_rotation_angle_y: A float scalar sets the maximum rotation angle (in + radians) along Y axis. + max_rotation_angle_z: A float scalar sets the maximum rotation angle (in + radians) along Z axis. """ diff --git a/keras_cv/layers/preprocessing_3d/global_random_scaling.py b/keras_cv/layers/preprocessing_3d/global_random_scaling.py index 73408405ca..7fec04ad6b 100644 --- a/keras_cv/layers/preprocessing_3d/global_random_scaling.py +++ b/keras_cv/layers/preprocessing_3d/global_random_scaling.py @@ -24,18 +24,20 @@ @keras.utils.register_keras_serializable(package="keras_cv") class GlobalRandomScaling(base_augmentation_layer_3d.BaseAugmentationLayer3D): - """A preprocessing layer which randomly scales point clouds and bounding boxes along - X, Y, and Z axes during training. + """A preprocessing layer which randomly scales point clouds and bounding + boxes along X, Y, and Z axes during training. - This layer will randomly scale the whole scene along the X, Y, and Z axes based on a randomly sampled - scaling factor between [min_scaling_factor, max_scaling_factor] following a uniform distribution. - During inference time, the output will be identical to input. Call the layer with `training=True` to scale the input. + This layer will randomly scale the whole scene along the X, Y, and Z axes + based on a randomly sampled scaling factor between [min_scaling_factor, + max_scaling_factor] following a uniform distribution. 
During inference time, + the output will be identical to input. Call the layer with `training=True` + to scale the input. Input shape: point_clouds: 3D (multi frames) float32 Tensor with shape [num of frames, num of points, num of point features]. The first 5 features are [x, y, z, class, range]. - bounding_boxes: 3D (multi frames) float32 Tensor with shape + bounding_boxes: 3D (multi frames) float32 Tensor with shape [num of frames, num of boxes, num of box features]. Boxes are expected to follow the CENTER_XYZ_DXDYDZ_PHI format. Refer to https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box_3d/formats.py @@ -45,9 +47,12 @@ class GlobalRandomScaling(base_augmentation_layer_3d.BaseAugmentationLayer3D): A dictionary of Tensors with the same shape as input Tensors. Arguments: - x_factor: A tuple of float scalars or a float scalar sets the minimum and maximum scaling factors for the X axis. - y_factor: A tuple of float scalars or a float scalar sets the minimum and maximum scaling factors for the Y axis. - z_factor: A tuple of float scalars or a float scalar sets the minimum and maximum scaling factors for the Z axis. + x_factor: A tuple of float scalars or a float scalar sets the minimum and + maximum scaling factors for the X axis. + y_factor: A tuple of float scalars or a float scalar sets the minimum and + maximum scaling factors for the Y axis. + z_factor: A tuple of float scalars or a float scalar sets the minimum and + maximum scaling factors for the Z axis. """ def __init__( @@ -105,11 +110,13 @@ def __init__( if preserve_aspect_ratio: if min_x_factor != min_y_factor or min_y_factor != min_z_factor: raise ValueError( - "min_factor must be the same when preserve_aspect_ratio is true." + "min_factor must be the same when preserve_aspect_ratio is " + "true." ) if max_x_factor != max_y_factor or max_y_factor != max_z_factor: raise ValueError( - "max_factor must be the same when preserve_aspect_ratio is true." + "max_factor must be the same when preserve_aspect_ratio is " + "true." ) self._min_x_factor = min_x_factor diff --git a/keras_cv/layers/preprocessing_3d/global_random_translation.py b/keras_cv/layers/preprocessing_3d/global_random_translation.py index 06ce876c62..344ff7c1e9 100644 --- a/keras_cv/layers/preprocessing_3d/global_random_translation.py +++ b/keras_cv/layers/preprocessing_3d/global_random_translation.py @@ -27,12 +27,14 @@ class GlobalRandomTranslation( base_augmentation_layer_3d.BaseAugmentationLayer3D ): - """A preprocessing layer which randomly translates point clouds and bounding boxes along - X, Y, and Z axes during training. + """A preprocessing layer which randomly translates point clouds and bounding + boxes along X, Y, and Z axes during training. - This layer will randomly translate the whole scene along the X, Y,and Z axes based on three randomly sampled - translation factors following three normal distributions centered at 0 with standard deviation [x_stddev, y_stddev, z_stddev]. - During inference time, the output will be identical to input. Call the layer with `training=True` to translate the input. + This layer will randomly translate the whole scene along the X, Y,and Z axes + based on three randomly sampled translation factors following three normal + distributions centered at 0 with standard deviation [x_stddev, y_stddev, + z_stddev]. During inference time, the output will be identical to input. + Call the layer with `training=True` to translate the input. 
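As with the other `preprocessing_3d` layers, `GlobalRandomTranslation` consumes a dictionary keyed by the `POINT_CLOUDS` and `BOUNDING_BOXES` constants. The sketch below follows the pattern used in the layer tests; the tensor shapes and standard deviations are illustrative values, not defaults.

```python
import tensorflow as tf

from keras_cv.layers.preprocessing_3d import base_augmentation_layer_3d
from keras_cv.layers.preprocessing_3d.global_random_translation import (
    GlobalRandomTranslation,
)

POINT_CLOUDS = base_augmentation_layer_3d.POINT_CLOUDS
BOUNDING_BOXES = base_augmentation_layer_3d.BOUNDING_BOXES

# Illustrative shapes: 2 frames, 100 points with [x, y, z, class, range]
# features, and 4 boxes in CENTER_XYZ_DXDYDZ_PHI format plus a class column.
point_clouds = tf.random.uniform((2, 100, 5))
bounding_boxes = tf.random.uniform((2, 4, 8))

layer = GlobalRandomTranslation(x_stddev=0.5, y_stddev=0.5, z_stddev=0.1)
inputs = {POINT_CLOUDS: point_clouds, BOUNDING_BOXES: bounding_boxes}

# With training=True the whole scene is shifted by one sampled offset per
# axis; with training=False the layer is a no-op.
outputs = layer(inputs, training=True)
```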
Input shape: point_clouds: 3D (multi frames) float32 Tensor with shape @@ -48,9 +50,12 @@ class GlobalRandomTranslation( A dictionary of Tensors with the same shape as input Tensors. Arguments: - x_stddev: A float scalar sets the translation noise standard deviation along the X axis. - y_stddev: A float scalar sets the translation noise standard deviation along the Y axis. - z_stddev: A float scalar sets the translation noise standard deviation along the Z axis. + x_stddev: A float scalar sets the translation noise standard deviation + along the X axis. + y_stddev: A float scalar sets the translation noise standard deviation + along the Y axis. + z_stddev: A float scalar sets the translation noise standard deviation + along the Z axis. """ def __init__(self, x_stddev=None, y_stddev=None, z_stddev=None, **kwargs): diff --git a/keras_cv/layers/preprocessing_3d/group_points_by_bounding_boxes.py b/keras_cv/layers/preprocessing_3d/group_points_by_bounding_boxes.py index c09b3b548c..9232720dd0 100644 --- a/keras_cv/layers/preprocessing_3d/group_points_by_bounding_boxes.py +++ b/keras_cv/layers/preprocessing_3d/group_points_by_bounding_boxes.py @@ -30,10 +30,13 @@ class GroupPointsByBoundingBoxes( base_augmentation_layer_3d.BaseAugmentationLayer3D ): - """A preprocessing layer which groups point clouds based on bounding boxes during training. + """A preprocessing layer which groups point clouds based on bounding boxes + during training. - This layer will group point clouds based on bounding boxes and generate OBJECT_POINT_CLOUDS and OBJECT_BOUNDING_BOXES tensors. - During inference time, the output will be identical to input. Call the layer with `training=True` to group point clouds based on bounding boxes. + This layer will group point clouds based on bounding boxes and generate + OBJECT_POINT_CLOUDS and OBJECT_BOUNDING_BOXES tensors. + During inference time, the output will be identical to input. Call the layer + with `training=True` to group point clouds based on bounding boxes. Input shape: point_clouds: 3D (multi frames) float32 Tensor with shape @@ -45,18 +48,25 @@ class GroupPointsByBoundingBoxes( https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box_3d/formats.py Output shape: - A dictionary of Tensors with the same shape as input Tensors and two additional items for - OBJECT_POINT_CLOUDS (shape [num of frames, num of valid boxes, max num of points, num of point features]) - and OBJECT_BOUNDING_BOXES (shape [num of frames, num of valid boxes, num of box features]). + A dictionary of Tensors with the same shape as input Tensors and two + additional items for OBJECT_POINT_CLOUDS (shape [num of frames, num of + valid boxes, max num of points, num of point features]) and + OBJECT_BOUNDING_BOXES (shape [num of frames, num of valid boxes, num of + box features]). Arguments: label_index: An optional int scalar sets the target object index. - Bounding boxes and corresponding point clouds with box class == label_index will be saved as OBJECT_BOUNDING_BOXES and OBJECT_POINT_CLOUDS. - If label index is None, all valid bounding boxes (box class !=0) are used. - min_points_per_bounding_boxes: A int scalar sets the min number of points in a bounding box. - If a bounding box contains less than min_points_per_bounding_boxes, the bounding box is filtered out. - max_points_per_bounding_boxes: A int scalar sets the max number of points in a bounding box. 
- All the object point clouds will be padded or trimmed to the same shape, where the number of points dimension is max_points_per_bounding_boxes. + Bounding boxes and corresponding point clouds with box class == + label_index will be saved as OBJECT_BOUNDING_BOXES and + OBJECT_POINT_CLOUDS. If label index is None, all valid bounding boxes + (box class !=0) are used. + min_points_per_bounding_boxes: A int scalar sets the min number of points + in a bounding box. If a bounding box contains less than + min_points_per_bounding_boxes, the bounding box is filtered out. + max_points_per_bounding_boxes: A int scalar sets the max number of points + in a bounding box. All the object point clouds will be padded or trimmed + to the same shape, where the number of points dimension is + max_points_per_bounding_boxes. """ def __init__( @@ -76,7 +86,8 @@ def __init__( raise ValueError("max_points_per_bounding_boxes must be >=0.") if min_points_per_bounding_boxes > max_points_per_bounding_boxes: raise ValueError( - "max_paste_bounding_boxes must be >= min_points_per_bounding_boxes." + "max_paste_bounding_boxes must be >= " + "min_points_per_bounding_boxes." ) self._label_index = label_index @@ -87,8 +98,8 @@ def __init__( def get_config(self): return { "label_index": self._label_index, - "min_points_per_bounding_boxes": self._min_points_per_bounding_boxes, - "max_points_per_bounding_boxes": self._max_points_per_bounding_boxes, + "min_points_per_bounding_boxes": self._min_points_per_bounding_boxes, # noqa: E501 + "max_points_per_bounding_boxes": self._max_points_per_bounding_boxes, # noqa: E501 } def augment_point_clouds_bounding_boxes( @@ -140,7 +151,8 @@ def augment_point_clouds_bounding_boxes( sort_valid_mask = tf.gather( points_in_bounding_boxes, sort_valid_index, axis=2, batch_dims=2 )[:, :, : self._max_points_per_bounding_boxes] - # [num of frames, num of boxes, self._max_points_per_bounding_boxes, num of point features]. + # [num of frames, num of boxes, self._max_points_per_bounding_boxes, num + # of point features]. object_point_clouds = point_clouds[:, tf.newaxis, :, :] num_valid_bounding_boxes = tf.shape(object_bounding_boxes)[1] object_point_clouds = tf.tile( @@ -198,7 +210,8 @@ def augment_point_clouds_bounding_boxes_v2( points_in_bounding_boxes, min_points_filter ) # point_clouds: [frames, num_points, point_feature] - # object_point_clouds: [frames, num_valid_boxes, ragged_points, point_feature] + # object_point_clouds: [frames, num_valid_boxes, ragged_points, + # point_feature] object_point_clouds = tf.gather( point_clouds, points_in_bounding_boxes, axis=1, batch_dims=1 ) @@ -252,19 +265,23 @@ def call(self, inputs, training=True): ) object_point_clouds_list += [object_point_clouds] object_bounding_boxes_list += [object_bounding_boxes] - # object_point_clouds shape [num of frames, num of valid boxes, max num of points, num of point features]. + # object_point_clouds shape [num of frames, num of valid boxes, + # max num of points, num of point features]. inputs[OBJECT_POINT_CLOUDS] = tf.concat( object_point_clouds_list, axis=-3 ) - # object_bounding_boxes shape [num of frames, num of valid boxes, num of box features]. + # object_bounding_boxes shape [num of frames, num of valid + # boxes, num of box features]. 
inputs[OBJECT_BOUNDING_BOXES] = tf.concat( object_bounding_boxes_list, axis=-2 ) return inputs else: raise ValueError( - "Point clouds augmentation layers are expecting inputs point clouds and bounding boxes to " - "be rank 3D (Frame, Point, Feature) or 4D (Batch, Frame, Point, Feature) tensors. Got shape: {} and {}".format( + "Point clouds augmentation layers are expecting inputs " + "point clouds and bounding boxes to be rank 3D (Frame, " + "Point, Feature) or 4D (Batch, Frame, Point, Feature) " + "tensors. Got shape: {} and {}".format( point_clouds.shape, bounding_boxes.shape ) ) diff --git a/keras_cv/layers/preprocessing_3d/group_points_by_bounding_boxes_test.py b/keras_cv/layers/preprocessing_3d/group_points_by_bounding_boxes_test.py index 587f9e2e3f..b465ada7f8 100644 --- a/keras_cv/layers/preprocessing_3d/group_points_by_bounding_boxes_test.py +++ b/keras_cv/layers/preprocessing_3d/group_points_by_bounding_boxes_test.py @@ -77,7 +77,8 @@ def test_augment_point_clouds_and_bounding_boxes(self): self.assertAllClose(inputs[POINT_CLOUDS], outputs[POINT_CLOUDS]) self.assertAllClose(inputs[BOUNDING_BOXES], outputs[BOUNDING_BOXES]) self.assertAllClose(inputs["dummy_item"], outputs["dummy_item"]) - # Sort the point clouds due to the orders of points are different when using Tensorflow and Metal+Tensorflow (MAC). + # Sort the point clouds due to the orders of points are different when + # using Tensorflow and Metal+Tensorflow (MAC). outputs[OBJECT_POINT_CLOUDS] = tf.sort( outputs[OBJECT_POINT_CLOUDS], axis=-2 ) @@ -171,7 +172,8 @@ def test_augment_batch_point_clouds_and_bounding_boxes(self): ).astype("float32") self.assertAllClose(inputs[POINT_CLOUDS], outputs[POINT_CLOUDS]) self.assertAllClose(inputs[BOUNDING_BOXES], outputs[BOUNDING_BOXES]) - # Sort the point clouds due to the orders of points are different when using Tensorflow and Metal+Tensorflow (MAC). + # Sort the point clouds due to the orders of points are different when + # using Tensorflow and Metal+Tensorflow (MAC). outputs[OBJECT_POINT_CLOUDS] = tf.sort( outputs[OBJECT_POINT_CLOUDS], axis=-2 ) diff --git a/keras_cv/layers/preprocessing_3d/random_copy_paste.py b/keras_cv/layers/preprocessing_3d/random_copy_paste.py index 888fbc87ec..31be9c9626 100644 --- a/keras_cv/layers/preprocessing_3d/random_copy_paste.py +++ b/keras_cv/layers/preprocessing_3d/random_copy_paste.py @@ -28,35 +28,46 @@ @keras.utils.register_keras_serializable(package="keras_cv") class RandomCopyPaste(base_augmentation_layer_3d.BaseAugmentationLayer3D): - """A preprocessing layer which randomly pastes object point clouds and bounding boxes during training. + """A preprocessing layer which randomly pastes object point clouds and + bounding boxes during training. This layer will randomly paste object point clouds and bounding boxes. - OBJECT_POINT_CLOUDS and OBJECT_BOUNDING_BOXES are generated by running group_points_by_bounding_boxes function on additional input frames. - We use the first frame to check overlap between existing bounding boxes and pasted bounding boxes - If a to-be-pasted bounding box overlaps with an existing bounding box and object point clouds, we do not paste the additional bounding box. - We load 5 times max_paste_bounding_boxes to check overlap. - If a to-be-pasted bounding box overlaps with existing background point clouds, we paste the additional bounding box and replace the - background point clouds with object point clouds. - During inference time, the output will be identical to input. 
Call the layer with `training=True` to paste bounding boxes. + OBJECT_POINT_CLOUDS and OBJECT_BOUNDING_BOXES are generated by running + group_points_by_bounding_boxes function on additional input frames. We use + the first frame to check overlap between existing bounding boxes and pasted + bounding boxes + If a to-be-pasted bounding box overlaps with an existing bounding box and + object point clouds, we do not paste the additional bounding box. We load 5 + times max_paste_bounding_boxes to check overlap. + If a to-be-pasted bounding box overlaps with existing background point + clouds, we paste the additional bounding box and replace the background + point clouds with object point clouds. + During inference time, the output will be identical to input. Call the layer + with `training=True` to paste bounding boxes. Input shape: point_clouds: 3D (multi frames) float32 Tensor with shape [num of frames, num of points, num of point features]. The first 5 features are [x, y, z, class, range]. bounding_boxes: 3D (multi frames) float32 Tensor with shape - [num of frames, num of boxes, num of box features]. Boxes are expected + [num of frames, num of boxes, num of box features]. Boxes are expected to follow the CENTER_XYZ_DXDYDZ_PHI format. Refer to https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box_3d/formats.py Output shape: - A tuple of two Tensors (point_clouds, bounding_boxes) with the same shape as input Tensors. + A tuple of two Tensors (point_clouds, bounding_boxes) with the same shape + as input Tensors. Arguments: label_index: An optional int scalar sets the target object index. - Bounding boxes and corresponding point clouds with box class == label_index will be saved as OBJECT_BOUNDING_BOXES and OBJECT_POINT_CLOUDS. - If label index is None, all valid bounding boxes (box class !=0) are used. - min_paste_bounding_boxes: A int scalar sets the min number of pasted bounding boxes. - max_paste_bounding_boxes: A int scalar sets the max number of pasted bounding boxes. + Bounding boxes and corresponding point clouds with box class == + label_index will be saved as OBJECT_BOUNDING_BOXES and + OBJECT_POINT_CLOUDS. If label index is None, all valid bounding boxes + (box class !=0) are used. + min_paste_bounding_boxes: A int scalar sets the min number of pasted + bounding boxes. + max_paste_bounding_boxes: A int scalar sets the max number of pasted + bounding boxes. """ @@ -140,7 +151,8 @@ def get_random_transformation( object_bounding_boxes = object_bounding_boxes[ :, :num_compare_bounding_boxes, : ] - # Use the current frame to check overlap between existing bounding boxes and pasted bounding boxes + # Use the current frame to check overlap between existing bounding boxes + # and pasted bounding boxes all_bounding_boxes = tf.concat( [bounding_boxes, object_bounding_boxes], axis=1 )[0, :, :7] @@ -283,8 +295,10 @@ def call(self, inputs, training=True): return inputs else: raise ValueError( - "Point clouds augmentation layers are expecting inputs point clouds and bounding boxes to " - "be rank 3D (Frame, Point, Feature) or 4D (Batch, Frame, Point, Feature) tensors. Got shape: {} and {}".format( + "Point clouds augmentation layers are expecting inputs " + "point clouds and bounding boxes to be rank 3D (Frame, " + "Point, Feature) or 4D (Batch, Frame, Point, Feature) " + "tensors. 
Got shape: {} and {}".format( point_clouds.shape, bounding_boxes.shape ) ) diff --git a/keras_cv/layers/preprocessing_3d/random_copy_paste_test.py b/keras_cv/layers/preprocessing_3d/random_copy_paste_test.py index d10a0ef689..651ba9e032 100644 --- a/keras_cv/layers/preprocessing_3d/random_copy_paste_test.py +++ b/keras_cv/layers/preprocessing_3d/random_copy_paste_test.py @@ -92,10 +92,10 @@ def test_augment_point_clouds_and_bounding_boxes(self): OBJECT_BOUNDING_BOXES: object_bounding_boxes, } outputs = add_layer(inputs) - # The first object bounding box [0, 0, 1, 4, 4, 4, 0, 1] overlaps with existing bounding - # box [0, 0, 0, 4, 4, 4, 0, 1], thus not used. - # The second object bounding box [100, 100, 2, 5, 5, 5, 0, 1] and object point clouds - # [100, 101, 2, 3, 4] are pasted. + # The first object bounding box [0, 0, 1, 4, 4, 4, 0, 1] overlaps with + # existing bounding box [0, 0, 0, 4, 4, 4, 0, 1], thus not used. + # The second object bounding box [100, 100, 2, 5, 5, 5, 0, 1] and object + # point clouds [100, 101, 2, 3, 4] are pasted. augmented_point_clouds = np.array( [ [ @@ -199,10 +199,10 @@ def test_augment_batch_point_clouds_and_bounding_boxes(self): OBJECT_BOUNDING_BOXES: object_bounding_boxes, } outputs = add_layer(inputs) - # The first object bounding box [0, 0, 1, 4, 4, 4, 0, 1] overlaps with existing bounding - # box [0, 0, 0, 4, 4, 4, 0, 1], thus not used. - # The second object bounding box [100, 100, 2, 5, 5, 5, 0, 1] and object point clouds - # [100, 101, 2, 3, 4] are pasted. + # The first object bounding box [0, 0, 1, 4, 4, 4, 0, 1] overlaps with + # existing bounding box [0, 0, 0, 4, 4, 4, 0, 1], thus not used. + # The second object bounding box [100, 100, 2, 5, 5, 5, 0, 1] and object + # point clouds [100, 101, 2, 3, 4] are pasted. augmented_point_clouds = np.array( [ [ diff --git a/keras_cv/layers/preprocessing_3d/random_drop_box.py b/keras_cv/layers/preprocessing_3d/random_drop_box.py index 9cf79bdea6..485d1383f2 100644 --- a/keras_cv/layers/preprocessing_3d/random_drop_box.py +++ b/keras_cv/layers/preprocessing_3d/random_drop_box.py @@ -25,13 +25,17 @@ @keras.utils.register_keras_serializable(package="keras_cv") class RandomDropBox(base_augmentation_layer_3d.BaseAugmentationLayer3D): - """A preprocessing layer which randomly drops object bounding boxes and points during training. + """A preprocessing layer which randomly drops object bounding boxes and + points during training. - This layer will randomly drop object point clouds and bounding boxes. Number of dropped bounding boxes - is sampled uniformly sampled between 0 and max_drop_bounding_boxes. If label_index is set, only bounding boxes with - box class == label_index will be sampled and dropped; otherwise, all valid bounding boxes (box class > 0) will be sampled and dropped. + This layer will randomly drop object point clouds and bounding boxes. Number + of dropped bounding boxes is sampled uniformly sampled between 0 and + max_drop_bounding_boxes. If label_index is set, only bounding boxes with box + class == label_index will be sampled and dropped; otherwise, all valid + bounding boxes (box class > 0) will be sampled and dropped. - During inference time, the output will be identical to input. Call the layer with `training=True` to drop object bounding boxes and points. + During inference time, the output will be identical to input. Call the layer + with `training=True` to drop object bounding boxes and points. 
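A usage sketch for `RandomDropBox` inside a `tf.data` pipeline is shown below. The argument names follow the docstring above; the dataset contents, the `label_index=None` choice, and the batch shapes are assumptions made for illustration.

```python
import tensorflow as tf

from keras_cv.layers.preprocessing_3d import base_augmentation_layer_3d
from keras_cv.layers.preprocessing_3d.random_drop_box import RandomDropBox

POINT_CLOUDS = base_augmentation_layer_3d.POINT_CLOUDS
BOUNDING_BOXES = base_augmentation_layer_3d.BOUNDING_BOXES

# Ten scenes, each with 2 frames of 100 points and 4 boxes (illustrative).
scenes = {
    POINT_CLOUDS: tf.random.uniform((10, 2, 100, 5)),
    BOUNDING_BOXES: tf.random.uniform((10, 2, 4, 8)),
}

# With label_index=None, any box with class > 0 may be dropped, up to
# max_drop_bounding_boxes boxes (and the points inside them) per scene.
drop_boxes = RandomDropBox(label_index=None, max_drop_bounding_boxes=2)

dataset = tf.data.Dataset.from_tensor_slices(scenes)
dataset = dataset.map(lambda inputs: drop_boxes(inputs, training=True))
```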
Input shape: point_clouds: 3D (multi frames) float32 Tensor with shape @@ -43,14 +47,18 @@ class RandomDropBox(base_augmentation_layer_3d.BaseAugmentationLayer3D): Output shape: - A tuple of two Tensors (point_clouds, bounding_boxes) with the same shape as input Tensors. + A tuple of two Tensors (point_clouds, bounding_boxes) with the same shape + as input Tensors. Arguments: - max_drop_bounding_boxes: A int non negative scalar sets the maximum number of dropped bounding boxes. - Do not drop any bounding boxe when max_drop_bounding_boxes = 0. + max_drop_bounding_boxes: A int non-negative scalar sets the maximum number + of dropped bounding boxes. Do not drop any bounding boxes when + max_drop_bounding_boxes = 0. label_index: An optional int scalar sets the target object index. - If label index is set, randomly drop bounding boxes, where box class == label_index. - If label index is None, randomly drop bounding boxes, where box class > 0. + If label index is set, randomly drop bounding boxes, where box + class == label_index. + If label index is None, randomly drop bounding boxes, where box + class > 0. """ diff --git a/keras_cv/layers/preprocessing_3d/random_drop_box_test.py b/keras_cv/layers/preprocessing_3d/random_drop_box_test.py index a8064268db..60f3ddb6e0 100644 --- a/keras_cv/layers/preprocessing_3d/random_drop_box_test.py +++ b/keras_cv/layers/preprocessing_3d/random_drop_box_test.py @@ -65,7 +65,8 @@ def test_drop_class1_box_point_clouds_and_bounding_boxes(self): BOUNDING_BOXES: bounding_boxes, } outputs = add_layer(inputs) - # Drop the first object bounding box [0, 0, 0, 4, 4, 4, 0, 1] and points. + # Drop the first object bounding box [0, 0, 0, 4, 4, 4, 0, 1] and + # points. augmented_point_clouds = np.array( [ [ @@ -272,8 +273,10 @@ def test_batch_drop_one_of_the_box_point_clouds_and_bounding_boxes(self): BOUNDING_BOXES: bounding_boxes, } outputs = add_layer(inputs) - # Batch 0: drop the first bounding box [0, 0, 0, 4, 4, 4, 0, 1] and points, - # Batch 1,2: drop the second bounding box [20, 20, 20, 3, 3, 3, 0, 2] and points, + # Batch 0: drop the first bounding box [0, 0, 0, 4, 4, 4, 0, 1] and + # points, + # Batch 1,2: drop the second bounding box [20, 20, 20, 3, 3, 3, 0, 2] + # and points, augmented_point_clouds = np.array( [ [ diff --git a/keras_cv/layers/preprocessing_3d/swap_background.py b/keras_cv/layers/preprocessing_3d/swap_background.py index 757fd51dbc..3fdbcf2449 100644 --- a/keras_cv/layers/preprocessing_3d/swap_background.py +++ b/keras_cv/layers/preprocessing_3d/swap_background.py @@ -28,16 +28,20 @@ @keras.utils.register_keras_serializable(package="keras_cv") class SwapBackground(base_augmentation_layer_3d.BaseAugmentationLayer3D): - """A preprocessing layer which swaps the backgrounds of two scenes during training. + """A preprocessing layer which swaps the backgrounds of two scenes during + training. - This layer will extract object point clouds and bounding boxes from an additional scene and paste it on to the training - scene while removing the objects in the training scene. - First, removing all the objects point clouds and bounding boxes in the training scene. - Second, extracting object point clouds and bounding boxes from an additional scene. - Third, removing backgrounds points clouds in the training scene that overlap with the additional object bounding boxes. - Last, pasting the additional object point clouds and bounding boxes to the training background scene. 
+ This layer will extract object point clouds and bounding boxes from an + additional scene and paste it on to the training scene while removing the + objects in the training scene. First, removing all the objects point clouds + and bounding boxes in the training scene. Second, extracting object point + clouds and bounding boxes from an additional scene. Third, removing + backgrounds points clouds in the training scene that overlap with the + additional object bounding boxes. Last, pasting the additional object point + clouds and bounding boxes to the training background scene. - During inference time, the output will be identical to input. Call the layer with `training=True` to swap backgrounds between two scenes. + During inference time, the output will be identical to input. Call the layer + with `training=True` to swap backgrounds between two scenes. Input shape: point_clouds: 3D (multi frames) float32 Tensor with shape @@ -50,7 +54,8 @@ class SwapBackground(base_augmentation_layer_3d.BaseAugmentationLayer3D): for more details on supported bounding box formats. Output shape: - A tuple of two Tensors (point_clouds, bounding_boxes) with the same shape as input Tensors. + A tuple of two Tensors (point_clouds, bounding_boxes) with the same shape + as input Tensors. """ @@ -69,7 +74,8 @@ def get_random_transformation( additional_bounding_boxes, **kwargs ): - # Use the current frame bounding boxes to determine valid bounding boxes. + # Use the current frame bounding boxes to determine valid bounding + # boxes. bounding_boxes = tf.boolean_mask( bounding_boxes, bounding_boxes[0, :, CENTER_XYZ_DXDYDZ_PHI.CLASS] > 0, @@ -103,7 +109,8 @@ def get_random_transformation( 0.0, ) - # Remove backgorund points in point_clouds overlaps with additional_bounding_boxes. + # Remove background points in point_clouds overlaps with + # additional_bounding_boxes. points_overlaps_additional_bounding_boxes = is_within_any_box3d( point_clouds[..., :3], additional_bounding_boxes[..., : CENTER_XYZ_DXDYDZ_PHI.CLASS], diff --git a/keras_cv/layers/preprocessing_3d/swap_background_test.py b/keras_cv/layers/preprocessing_3d/swap_background_test.py index 5adc8ca1c1..71b244c89a 100644 --- a/keras_cv/layers/preprocessing_3d/swap_background_test.py +++ b/keras_cv/layers/preprocessing_3d/swap_background_test.py @@ -86,17 +86,27 @@ def test_augment_point_clouds_and_bounding_boxes(self): } outputs = add_layer(inputs) # The following points in additional_point_clouds. - # [0, 2, 1, 3, 4], -> kept because it is in additional_point_clouds [0, 0, 1, 4, 4, 4, 0, 1]. - # [0, 0, 2, 0, 2] -> removed because it is a background point (not in any bounding_boxes and additional_point_clouds). - # [0, 11, 2, 3, 4] -> removed because it is a background point (not in any bounding_boxes and additional_point_clouds). - # [100, 101, 2, 3, 4] -> kept because it is in additional_point_clouds [100, 100, 2, 5, 5, 5, 0, 1]. - # [10, 10, 10, 10, 10] -> removed because it is a background point (not in any bounding_boxes and additional_point_clouds). + # [0, 2, 1, 3, 4], -> kept because it is in additional_point_clouds + # [0, 0, 1, 4, 4, 4, 0, 1]. + # [0, 0, 2, 0, 2] -> removed because it is a background point (not in + # any bounding_boxes and additional_point_clouds). + # [0, 11, 2, 3, 4] -> removed because it is a background point (not in + # any bounding_boxes and additional_point_clouds). + # [100, 101, 2, 3, 4] -> kept because it is in additional_point_clouds + # [100, 100, 2, 5, 5, 5, 0, 1]. 
+ # [10, 10, 10, 10, 10] -> removed because it is a background point (not + # in any bounding_boxes and additional_point_clouds). # The following points in point_clouds. - # [0, 1, 2, 3, 4] -> removed because it is in bounding_boxes [0, 0, 0, 4, 4, 4, 0, 1]. - # [10, 1, 2, 3, 4] -> kept because it is a background point (not in any bounding_boxes and additional_point_clouds). - # [0, -1, 2, 3, 4] -> removed becuase it overlaps with additional_bounding_boxes [0, 0, 1, 4, 4, 4, 0, 1]. - # [100, 100, 2, 3, 4] -> removed becuase it overlaps with additional_bounding_boxes [100, 100, 2, 5, 5, 5, 0, 1]. - # [20, 20, 21, 1, 0] -> kept because it is a background point (not in any bounding_boxes and additional_point_clouds). + # [0, 1, 2, 3, 4] -> removed because it is in bounding_boxes + # [0, 0, 0, 4, 4, 4, 0, 1]. + # [10, 1, 2, 3, 4] -> kept because it is a background point (not in any + # bounding_boxes and additional_point_clouds). + # [0, -1, 2, 3, 4] -> removed because it overlaps with + # additional_bounding_boxes [0, 0, 1, 4, 4, 4, 0, 1]. + # [100, 100, 2, 3, 4] -> removed because it overlaps with + # additional_bounding_boxes [100, 100, 2, 5, 5, 5, 0, 1]. + # [20, 20, 21, 1, 0] -> kept because it is a background point (not in + # any bounding_boxes and additional_point_clouds). augmented_point_clouds = np.array( [ [ @@ -204,17 +214,27 @@ def test_augment_batch_point_clouds_and_bounding_boxes(self): } outputs = add_layer(inputs) # The following points in additional_point_clouds. - # [0, 2, 1, 3, 4], -> kept because it is in additional_point_clouds [0, 0, 1, 4, 4, 4, 0, 1]. - # [0, 0, 2, 0, 2] -> removed because it is a background point (not in any bounding_boxes and additional_point_clouds). - # [0, 11, 2, 3, 4] -> removed because it is a background point (not in any bounding_boxes and additional_point_clouds). - # [100, 101, 2, 3, 4] -> kept because it is in additional_point_clouds [100, 100, 2, 5, 5, 5, 0, 1]. - # [10, 10, 10, 10, 10] -> removed because it is a background point (not in any bounding_boxes and additional_point_clouds). + # [0, 2, 1, 3, 4], -> kept because it is in additional_point_clouds + # [0, 0, 1, 4, 4, 4, 0, 1]. + # [0, 0, 2, 0, 2] -> removed because it is a background point (not in + # any bounding_boxes and additional_point_clouds). + # [0, 11, 2, 3, 4] -> removed because it is a background point (not in + # any bounding_boxes and additional_point_clouds). + # [100, 101, 2, 3, 4] -> kept because it is in additional_point_clouds + # [100, 100, 2, 5, 5, 5, 0, 1]. + # [10, 10, 10, 10, 10] -> removed because it is a background point (not + # in any bounding_boxes and additional_point_clouds). # The following points in point_clouds. - # [0, 1, 2, 3, 4] -> removed because it is in bounding_boxes [0, 0, 0, 4, 4, 4, 0, 1]. - # [10, 1, 2, 3, 4] -> kept because it is a background point (not in any bounding_boxes and additional_point_clouds). - # [0, -1, 2, 3, 4] -> removed becuase it overlaps with additional_bounding_boxes [0, 0, 1, 4, 4, 4, 0, 1]. - # [100, 100, 2, 3, 4] -> removed becuase it overlaps with additional_bounding_boxes [100, 100, 2, 5, 5, 5, 0, 1]. - # [20, 20, 21, 1, 0] -> kept because it is a background point (not in any bounding_boxes and additional_point_clouds). + # [0, 1, 2, 3, 4] -> removed because it is in bounding_boxes\ + # [0, 0, 0, 4, 4, 4, 0, 1]. + # [10, 1, 2, 3, 4] -> kept because it is a background point (not in any + # bounding_boxes and additional_point_clouds). 
+ # [0, -1, 2, 3, 4] -> removed because it overlaps with + # additional_bounding_boxes [0, 0, 1, 4, 4, 4, 0, 1]. + # [100, 100, 2, 3, 4] -> removed because it overlaps with + # additional_bounding_boxes [100, 100, 2, 5, 5, 5, 0, 1]. + # [20, 20, 21, 1, 0] -> kept because it is a background point (not in + # any bounding_boxes and additional_point_clouds). augmented_point_clouds = np.array( [ [ diff --git a/keras_cv/layers/regularization/drop_path.py b/keras_cv/layers/regularization/drop_path.py index 39779f67d2..ea7de09396 100644 --- a/keras_cv/layers/regularization/drop_path.py +++ b/keras_cv/layers/regularization/drop_path.py @@ -18,11 +18,11 @@ @keras.utils.register_keras_serializable(package="keras_cv") class DropPath(keras.__internal__.layers.BaseRandomLayer): """ - Implements the DropPath layer. DropPath randomly drops samples during training - with a probability of `rate`. Note that this layer drops individual samples - within a batch and not the entire batch. DropPath randomly drops some of the - individual samples from a batch, whereas StachasticDepth randomly drops the - entire batch. + Implements the DropPath layer. DropPath randomly drops samples during + training with a probability of `rate`. Note that this layer drops individual + samples within a batch and not the entire batch. DropPath randomly drops + some individual samples from a batch, whereas StochasticDepth + randomly drops the entire batch. References: - [FractalNet](https://arxiv.org/abs/1605.07648v4). @@ -30,7 +30,7 @@ class DropPath(keras.__internal__.layers.BaseRandomLayer): Args: rate: float, the probability of the residual branch being dropped. - seed: (Optional) Integer. Used to create a random seed. + seed: (Optional) integer. Used to create a random seed. Usage: `DropPath` can be used in any network as follows: @@ -42,7 +42,7 @@ class DropPath(keras.__internal__.layers.BaseRandomLayer): output = keras_cv.layers.DropPath()(input) # (...) ``` - """ + """ # noqa: E501 def __init__(self, rate=0.5, seed=None, **kwargs): super().__init__(seed=seed, **kwargs) diff --git a/keras_cv/layers/regularization/dropblock_2d.py b/keras_cv/layers/regularization/dropblock_2d.py index b091f43bad..6c7114aad6 100644 --- a/keras_cv/layers/regularization/dropblock_2d.py +++ b/keras_cv/layers/regularization/dropblock_2d.py @@ -28,26 +28,23 @@ class DropBlock2D(BaseRandomLayer): dropout on convolutional layers due to the fact that activation units in convolutional layers are spatially correlated. - It is advised to use DropBlock after activation in Conv -> BatchNorm -> Activation - block in further layers of the network. For example, the paper mentions using - DropBlock in 3rd and 4th group of ResNet blocks. + It is advised to use DropBlock after activation in Conv -> BatchNorm -> + Activation block in further layers of the network. For example, the paper + mentions using DropBlock in 3rd and 4th group of ResNet blocks. Reference: - - [DropBlock: A regularization method for convolutional networks]( - https://arxiv.org/abs/1810.12890 - ) + - [DropBlock: A regularization method for convolutional networks](https://arxiv.org/abs/1810.12890) Args: rate: float. Probability of dropping a unit. Must be between 0 and 1. For best results, the value should be between 0.05-0.25. block_size: integer, or tuple of integers. The size of the block to be - dropped. In case of an integer a square block will be dropped. In case of a - tuple, the numbers are block's (height, width). 
- Must be bigger than 0, and should not be bigger than the input feature map - size. The paper authors use `block_size=7` for input feature's of size - `14x14xchannels`. - If this value is greater or equal to the input feature map size you will - encounter `nan` values. + dropped. In case of an integer a square block will be dropped. In + case of a tuple, the numbers are block's (height, width). Must be + bigger than 0, and should not be bigger than the input feature map + size. The paper authors use `block_size=7` for input feature's of + size `14x14xchannels`. If this value is greater or equal to the + input feature map size you will encounter `nan` values. seed: integer. To use as random seed. name: string. The name of the layer. @@ -61,8 +58,8 @@ class DropBlock2D(BaseRandomLayer): x = DropBlock2D(0.1, block_size=7)(x) # (...) ``` - When used directly, the layer will zero-out some inputs in a contiguous region and - normalize the remaining values. + When used directly, the layer will zero-out some inputs in a contiguous + region and normalize the remaining values. ```python # Small feature map shape for demonstration purposes: @@ -77,25 +74,27 @@ class DropBlock2D(BaseRandomLayer): # [0.38977218 0.80855536 0.6040567 0.10502195]]], shape=(1, 4, 4), # dtype=float32) - layer = DropBlock2D(0.1, block_size=2, seed=1234) # Small size for demonstration + layer = DropBlock2D(0.1, block_size=2, seed=1234) # Small size for + demonstration output = layer(features, training=True) # Preview the feature map after dropblock: print(output[..., 0]) # tf.Tensor( - # [[[0.10955477 0.54570675 0.5242462 0.42167106] - # [0.46290365 0.97599393 0. 0. ] - # [0.7365858 0.17468326 0. 0. ] - # [0.51969624 1.0780739 0.80540895 0.14002927]]], shape=(1, 4, 4), - # dtype=float32) + # [[[0.10955477 0.54570675 0.5242462 0.42167106] + # [0.46290365 0.97599393 0. 0. ] + # [0.7365858 0.17468326 0. 0. ] + # [0.51969624 1.0780739 0.80540895 0.14002927]]], + # shape=(1, 4, 4), + # dtype=float32) # We can observe two things: # 1. A 2x2 block has been dropped # 2. The inputs have been slightly scaled to account for missing values. - # The number of blocks dropped can vary, between the channels - sometimes no blocks - # will be dropped, and sometimes there will be multiple overlapping blocks. - # Let's present on a larger feature map: + # The number of blocks dropped can vary, between the channels - sometimes no + # blocks will be dropped, and sometimes there will be multiple overlapping + # blocks. Let's present on a larger feature map: features = tf.random.stateless_uniform((1, 4, 4, 36), seed=[0, 1]) layer = DropBlock2D(0.1, (2, 2), seed=123) @@ -103,36 +102,41 @@ class DropBlock2D(BaseRandomLayer): print(output[..., 0]) # no drop # tf.Tensor( - # [[[0.09136613 0.98085546 0.15265216 0.19690938] - # [0.48835075 0.52433217 0.1661478 0.7067729 ] - # [0.07383626 0.9938906 0.14309917 0.06882786] - # [0.43242374 0.04158871 0.24213943 0.1903095 ]]], shape=(1, 4, 4), - # dtype=float32) + # [[[0.09136613 0.98085546 0.15265216 0.19690938] + # [0.48835075 0.52433217 0.1661478 0.7067729 ] + # [0.07383626 0.9938906 0.14309917 0.06882786] + # [0.43242374 0.04158871 0.24213943 0.1903095 ]]], + # shape=(1, 4, 4), + # dtype=float32) print(output[..., 9]) # drop single block # tf.Tensor( - # [[[0.14568178 0.01571623 0.9082305 1.0545396 ] - # [0.24126057 0.86874676 0. 0. ] - # [0.44101703 0.29805306 0. 0. 
] - # [0.56835717 0.04925899 0.6745584 0.20550345]]], shape=(1, 4, 4), dtype=float32) + # [[[0.14568178 0.01571623 0.9082305 1.0545396 ] + # [0.24126057 0.86874676 0. 0. ] + # [0.44101703 0.29805306 0. 0. ] + # [0.56835717 0.04925899 0.6745584 0.20550345]]], + # shape=(1, 4, 4), + # dtype=float32) print(output[..., 22]) # drop two blocks # tf.Tensor( - # [[[0.69479376 0.49463132 1.0627024 0.58349967] - # [0. 0. 0.36143216 0.58699244] - # [0. 0. 0. 0. ] - # [0.0315055 1.0117861 0. 0. ]]], shape=(1, 4, 4), - # dtype=float32) + # [[[0.69479376 0.49463132 1.0627024 0.58349967] + # [0. 0. 0.36143216 0.58699244] + # [0. 0. 0. 0. ] + # [0.0315055 1.0117861 0. 0. ]]], + # shape=(1, 4, 4), + # dtype=float32) print(output[..., 29]) # drop two blocks with overlap # tf.Tensor( - # [[[0.2137237 0.9120104 0.9963533 0.33937347] - # [0.21868704 0.44030213 0.5068906 0.20034194] - # [0. 0. 0. 0.5915383 ] - # [0. 0. 0. 0.9526224 ]]], shape=(1, 4, 4), - # dtype=float32) + # [[[0.2137237 0.9120104 0.9963533 0.33937347] + # [0.21868704 0.44030213 0.5068906 0.20034194] + # [0. 0. 0. 0.5915383 ] + # [0. 0. 0. 0.9526224 ]]], + # shape=(1, 4, 4), + # dtype=float32) ``` - """ + """ # noqa: E501 def __init__( self, diff --git a/keras_cv/layers/regularization/squeeze_excite.py b/keras_cv/layers/regularization/squeeze_excite.py index 0b133642c1..e70bad149e 100644 --- a/keras_cv/layers/regularization/squeeze_excite.py +++ b/keras_cv/layers/regularization/squeeze_excite.py @@ -26,8 +26,8 @@ class SqueezeAndExcite2D(layers.Layer): weights adaptively. It first squeezes the feature maps into a single value using global average pooling, which are then fed into two Conv1D layers, which act like fully-connected layers. The first layer reduces the - dimensionality of the feature maps by a factor of `ratio`, whereas the second - layer restores it to its original value. + dimensionality of the feature maps by a factor of `ratio`, whereas the + second layer restores it to its original value. The resultant values are the adaptive weights for each channel. These weights are then multiplied with the original inputs to scale the outputs @@ -38,13 +38,15 @@ class SqueezeAndExcite2D(layers.Layer): filters: Number of input and output filters. The number of input and output filters is same. ratio: Ratio for bottleneck filters. Number of bottleneck filters = - filters * ratio. Defaults to 0.25. - squeeze_activation: (Optional) String, callable (or keras.layers.Layer) or - keras.activations.Activation instance denoting activation to - be applied after squeeze convolution. Defaults to `relu`. - excite_activation: (Optional) String, callable (or keras.layers.Layer) or - keras.activations.Activation instance denoting activation to - be applied after excite convolution. Defaults to `sigmoid`. + filters * ratio, defaults to 0.25. + squeeze_activation: (Optional) String, callable (or + keras.layers.Layer) or keras.activations.Activation instance + denoting activation to be applied after squeeze convolution. + Defaults to `relu`. + excite_activation: (Optional) String, callable (or + keras.layers.Layer) or keras.activations.Activation instance + denoting activation to be applied after excite convolution. + Defaults to `sigmoid`. 
Usage: ```python diff --git a/keras_cv/layers/regularization/stochastic_depth.py b/keras_cv/layers/regularization/stochastic_depth.py index 3dfc460ed6..dda290b4e1 100644 --- a/keras_cv/layers/regularization/stochastic_depth.py +++ b/keras_cv/layers/regularization/stochastic_depth.py @@ -24,8 +24,8 @@ class StochasticDepth(keras.layers.Layer): individual samples but across the entire batch. Reference: - - [Deep Networks with Stochastic Depth](https://arxiv.org/abs/1603.09382). - - Docstring taken from [stochastic_depth.py](https://tinyurl.com/mr3y2af6) + - [Deep Networks with Stochastic Depth](https://arxiv.org/abs/1603.09382) + - [Docstring taken from [stochastic_depth.py](https://tinyurl.com/mr3y2af6) Args: rate: float, the probability of the residual branch being dropped. @@ -51,7 +51,7 @@ class StochasticDepth(keras.layers.Layer): $$ x[0] + (1 - rate) * x[1] $$ - """ + """ # noqa: E501 def __init__(self, rate=0.5, **kwargs): super().__init__(**kwargs) diff --git a/keras_cv/layers/spatial_pyramid.py b/keras_cv/layers/spatial_pyramid.py index 3a904b9336..c3ae2a3d9d 100644 --- a/keras_cv/layers/spatial_pyramid.py +++ b/keras_cv/layers/spatial_pyramid.py @@ -25,19 +25,19 @@ class SpatialPyramidPooling(keras.layers.Layer): """Implements the Atrous Spatial Pyramid Pooling. References: - [Rethinking Atrous Convolution for Semantic Image Segmentation]( - https://arxiv.org/pdf/1706.05587.pdf) - [Encoder-Decoder with Atrous Separable Convolution for Semantic Image - Segmentation](https://arxiv.org/pdf/1802.02611.pdf) + [Rethinking Atrous Convolution for Semantic Image Segmentation](https://arxiv.org/pdf/1706.05587.pdf) + [Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation](https://arxiv.org/pdf/1802.02611.pdf) inp = keras.layers.Input((384, 384, 3)) - backbone = keras.applications.EfficientNetB0(input_tensor=inp, include_top=False) + backbone = keras.applications.EfficientNetB0( + input_tensor=inp, + include_top=False) output = backbone(inp) output = keras_cv.layers.SpatialPyramidPooling( dilation_rates=[6, 12, 18])(output) # output[4].shape = [None, 16, 16, 256] - """ + """ # noqa: E501 def __init__( self, @@ -50,13 +50,13 @@ def __init__( """Initializes an Atrous Spatial Pyramid Pooling layer. Args: - dilation_rates: A `list` of integers for parallel dilated conv. Usually a - sample choice of rates are [6, 12, 18]. - num_channels: An `int` number of output channels. Default to 256. - activation: A `str` activation to be used. Default to 'relu'. - dropout: A `float` for the dropout rate of the final projection output after - the activations and batch norm. Default to 0.0, which means no dropout is - applied to the output. + dilation_rates: A `list` of integers for parallel dilated conv. + Usually a sample choice of rates are [6, 12, 18]. + num_channels: An `int` number of output channels, defaults to 256. + activation: A `str` activation to be used, defaults to 'relu'. + dropout: A `float` for the dropout rate of the final projection + output after the activations and batch norm, defaults to 0.0, + which means no dropout is applied to the output. **kwargs: Additional keyword arguments to be passed. """ super().__init__(**kwargs) @@ -70,9 +70,9 @@ def build(self, input_shape): width = input_shape[2] channels = input_shape[3] - # This is the parallel networks that process the input features with different - # dilation rates. The output from each channel will be merged together and feed - # to the output. 
+ # This is the parallel networks that process the input features with + # different dilation rates. The output from each channel will be merged + # together and feed to the output. self.aspp_parallel_channels = [] # Channel1 with Conv2D and 1x1 kernel size. @@ -89,8 +89,8 @@ def build(self, input_shape): ) self.aspp_parallel_channels.append(conv_sequential) - # Channel 2 and afterwards are based on self.dilation_rates, and each of them - # will have conv2D with 3x3 kernel size. + # Channel 2 and afterwards are based on self.dilation_rates, and each of + # them will have conv2D with 3x3 kernel size. for dilation_rate in self.dilation_rates: conv_sequential = keras.Sequential( [ diff --git a/keras_cv/layers/transformer_encoder.py b/keras_cv/layers/transformer_encoder.py index 15b756821d..da54083646 100644 --- a/keras_cv/layers/transformer_encoder.py +++ b/keras_cv/layers/transformer_encoder.py @@ -22,13 +22,19 @@ class TransformerEncoder(layers.Layer): Transformer encoder block implementation as a Keras Layer. Args: - project_dim: the dimensionality of the projection of the encoder, and output of the `MultiHeadAttention` - mlp_dim: the intermediate dimensionality of the MLP head before projecting to `project_dim` + project_dim: the dimensionality of the projection of the encoder, and + output of the `MultiHeadAttention` + mlp_dim: the intermediate dimensionality of the MLP head before + projecting to `project_dim` num_heads: the number of heads for the `MultiHeadAttention` layer - mlp_dropout: default 0.1, the dropout rate to apply between the layers of the MLP head of the encoder - attention_dropout: default 0.1, the dropout rate to apply in the MultiHeadAttention layer - activation: default 'tf.activations.gelu', the activation function to apply in the MLP head - should be a function - layer_norm_epsilon: default 1e-06, the epsilon for `LayerNormalization` layers + mlp_dropout: default 0.1, the dropout rate to apply between the layers + of the MLP head of the encoder + attention_dropout: default 0.1, the dropout rate to apply in the + MultiHeadAttention layer + activation: default 'tf.activations.gelu', the activation function to + apply in the MLP head - should be a function + layer_norm_epsilon: default 1e-06, the epsilon for `LayerNormalization` + layers Basic usage: @@ -37,10 +43,12 @@ class TransformerEncoder(layers.Layer): mlp_dim = 3072 num_heads = 4 - encoded_patches = keras_cv.layers.PatchingAndEmbedding(project_dim=project_dim, patch_size=16)(img_batch) + encoded_patches = keras_cv.layers.PatchingAndEmbedding( + project_dim=project_dim, + patch_size=16)(img_batch) trans_encoded = keras_cv.layers.TransformerEncoder(project_dim=project_dim, - mlp_dim = mlp_dim, - num_heads=num_heads)(encoded_patches) + mlp_dim = mlp_dim, + num_heads=num_heads)(encoded_patches) print(trans_encoded.shape) # (1, 197, 1024) ``` @@ -92,7 +100,9 @@ def call(self, inputs): if inputs.shape[-1] != self.project_dim: raise ValueError( - f"The input and output dimensionality must be the same, but the TransformerEncoder was provided with {inputs.shape[-1]} and {self.project_dim}" + "The input and output dimensionality must be the same, but the " + f"TransformerEncoder was provided with {inputs.shape[-1]} and " + f"{self.project_dim}" ) x = self.layer_norm1(inputs) diff --git a/keras_cv/layers/transformer_encoder_test.py b/keras_cv/layers/transformer_encoder_test.py index 1423f5cada..fa39af780f 100644 --- a/keras_cv/layers/transformer_encoder_test.py +++ b/keras_cv/layers/transformer_encoder_test.py @@ -34,7 
+34,8 @@ def test_wrong_input_dims(self): inputs = tf.random.normal([1, 197, 256]) with self.assertRaisesRegexp( ValueError, - "The input and output dimensionality must be the same, but the TransformerEncoder was provided with 256 and 128", + "The input and output dimensionality must be the same, but the " + "TransformerEncoder was provided with 256 and 128", ): layer(inputs, training=True) @@ -45,6 +46,7 @@ def test_wrong_project_dims(self): inputs = tf.random.normal([1, 197, 128]) with self.assertRaisesRegexp( ValueError, - "The input and output dimensionality must be the same, but the TransformerEncoder was provided with 128 and 256", + "The input and output dimensionality must be the same, but the " + "TransformerEncoder was provided with 128 and 256", ): layer(inputs, training=True) diff --git a/keras_cv/layers/vit_layers.py b/keras_cv/layers/vit_layers.py index a63b64d751..ed4b0e7769 100644 --- a/keras_cv/layers/vit_layers.py +++ b/keras_cv/layers/vit_layers.py @@ -26,13 +26,14 @@ class PatchingAndEmbedding(layers.Layer): Layer to patchify images, prepend a class token, positionally embed and create a projection of patches for Vision Transformers - The layer expects a batch of input images and returns batches of patches, flattened as a sequence - and projected onto `project_dims`. If the height and width of the images - aren't divisible by the patch size, the supplied padding type is used (or 'VALID' by default). + The layer expects a batch of input images and returns batches of patches, + flattened as a sequence and projected onto `project_dims`. If the height and + width of the images aren't divisible by the patch size, the supplied padding + type is used (or 'VALID' by default). Reference: - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale - by Alexey Dosovitskiy et al. (https://arxiv.org/abs/2010.11929) + An Image is Worth 16x16 Words: Transformers for Image Recognition at + Scale by Alexey Dosovitskiy et al. (https://arxiv.org/abs/2010.11929) Args: project_dim: the dimensionality of the project_dim @@ -40,15 +41,16 @@ class PatchingAndEmbedding(layers.Layer): padding: default 'VALID', the padding to apply for patchifying images Returns: - Patchified and linearly projected input images, including a prepended learnable class token - with shape (batch, num_patches+1, project_dim) + Patchified and linearly projected input images, including a prepended + learnable class token with shape (batch, num_patches+1, project_dim) Basic usage: ``` images = #... batch of images - encoded_patches = keras_cv.layers.PatchingAndEmbedding(project_dim=project_dim - patch_size=patch_size)(patches) + encoded_patches = keras_cv.layers.PatchingAndEmbedding( + project_dim=project_dim, + patch_size=patch_size)(patches) print(encoded_patches.shape) # (1, 197, 1024) ``` """ @@ -60,11 +62,13 @@ def __init__(self, project_dim, patch_size, padding="VALID", **kwargs): self.padding = padding if patch_size < 0: raise ValueError( - f"The patch_size cannot be a negative number. Received {patch_size}" + "The patch_size cannot be a negative number. Received " + f"{patch_size}" ) if padding not in ["VALID", "SAME"]: raise ValueError( - f"Padding must be either 'SAME' or 'VALID', but {padding} was passed." + f"Padding must be either 'SAME' or 'VALID', but {padding} was " + "passed." 
) self.projection = layers.Conv2D( filters=self.project_dim, @@ -101,7 +105,8 @@ def call( interpolate: A `bool` to enable or disable interpolation interpolate_height: An `int` representing interpolated height interpolate_width: An `int` representing interpolated width - patch_size: An `int` representing the new patch size if interpolation is used + patch_size: An `int` representing the new patch size if + interpolation is used Returns: `A tf.Tensor` of shape [batch, patch_num+1, embedding_dim] @@ -118,7 +123,8 @@ def call( ), ) - # Add learnable class token before linear projection and positional embedding + # Add learnable class token before linear projection and positional + # embedding flattened_shapes = tf.shape(patches_flattened) class_token_broadcast = tf.cast( tf.broadcast_to( @@ -154,7 +160,8 @@ def call( patch_size, ): raise ValueError( - "`None of `interpolate_width`, `interpolate_height` and `patch_size` cannot be None if `interpolate` is True" + "`None of `interpolate_width`, `interpolate_height` and " + "`patch_size` cannot be None if `interpolate` is True" ) else: encoded = patches_flattened + self.position_embedding(positions) @@ -164,8 +171,9 @@ def __interpolate_positional_embeddings( self, embedding, height, width, patch_size ): """ - Allows for pre-trained position embedding interpolation. This trick allows you to fine-tune a ViT - on higher resolution images than it was trained on. + Allows for pre-trained position embedding interpolation. This trick + allows you to fine-tune a ViT on higher resolution images than it was + trained on. Based on: https://github.com/huggingface/transformers/blob/main/src/transformers/models/vit/modeling_tf_vit.py diff --git a/keras_cv/layers/vit_layers_test.py b/keras_cv/layers/vit_layers_test.py index 3d8fd0d3be..0aa39b4169 100644 --- a/keras_cv/layers/vit_layers_test.py +++ b/keras_cv/layers/vit_layers_test.py @@ -50,7 +50,7 @@ def test_patch_embedding_interpolation(self): ( output, cls, - ) = patch_embedding._PatchingAndEmbedding__interpolate_positional_embeddings( + ) = patch_embedding._PatchingAndEmbedding__interpolate_positional_embeddings( # noqa: E501 positional_embeddings, height=450, width=450, patch_size=12 ) @@ -67,7 +67,7 @@ def test_patch_embedding_interpolation_numerical(self): ( output, cls_token, - ) = patch_embedding._PatchingAndEmbedding__interpolate_positional_embeddings( + ) = patch_embedding._PatchingAndEmbedding__interpolate_positional_embeddings( # noqa: E501 positional_embeddings, height=8, width=8, patch_size=2 ) diff --git a/keras_cv/losses/center_net_box_loss.py b/keras_cv/losses/center_net_box_loss.py index d48e7f9a61..c565b5ed13 100644 --- a/keras_cv/losses/center_net_box_loss.py +++ b/keras_cv/losses/center_net_box_loss.py @@ -56,7 +56,7 @@ class CenterNetBoxLoss(keras.losses.Loss): Args: num_heading_bins: int, number of bins used for predicting box heading. anchor_size: list of 3 ints, anchor sizes for the x, y, and z axes. - """ + """ # noqa: E501 def __init__(self, num_heading_bins, anchor_size, **kwargs): super().__init__(**kwargs) diff --git a/keras_cv/losses/focal.py b/keras_cv/losses/focal.py index 8294d88a70..696c788c87 100644 --- a/keras_cv/losses/focal.py +++ b/keras_cv/losses/focal.py @@ -30,14 +30,14 @@ class imbalance. For this reason, it's commonly used with object detectors. classes have alpha and (1 - alpha) as their weighting factors respectively. Defaults to 0.25. gamma: a positive float value representing the tunable focusing - parameter. Defaults to 2. + parameter, defaults to 2. 
from_logits: Whether `y_pred` is expected to be a logits tensor. By default, `y_pred` is assumed to encode a probability distribution. Default to `False`. label_smoothing: Float in `[0, 1]`. If higher than 0 then smooth the - labels by squeezing them towards `0.5`, i.e., using `1. - 0.5 * label_smoothing` - for the target class and `0.5 * label_smoothing` for the non-target - class. + labels by squeezing them towards `0.5`, i.e., using + `1. - 0.5 * label_smoothing` for the target class and + `0.5 * label_smoothing` for the non-target class. References: - [Focal Loss paper](https://arxiv.org/abs/1708.02002) @@ -91,7 +91,7 @@ def call(self, y_true, y_pred): loss = alpha * tf.pow(1.0 - pt, self.gamma) * cross_entropy # In most losses you mean over the final axis to achieve a scalar # Focal loss however is a special case in that it is meant to focus on - # a small number of hard examples in a batch. Most of the time this + # a small number of hard examples in a batch. Most of the time this # comes in the form of thousands of background class boxes and a few # positive boxes. # If you mean over the final axis you will get a number close to 0, diff --git a/keras_cv/losses/giou_loss.py b/keras_cv/losses/giou_loss.py index c44c7c446f..cdb984e02e 100644 --- a/keras_cv/losses/giou_loss.py +++ b/keras_cv/losses/giou_loss.py @@ -24,18 +24,19 @@ class GIoULoss(keras.losses.Loss): """Implements the Generalized IoU Loss - GIoU loss is a modified IoU loss commonly used for object detection. This loss aims - to directly optimize the IoU score between true boxes and predicted boxes. GIoU loss - adds a penalty term to the IoU loss that takes in account the area of the - smallest box enclosing both the boxes being considered for the iou. The length of - the last dimension should be 4 to represent the bounding boxes. + GIoU loss is a modified IoU loss commonly used for object detection. This + loss aims to directly optimize the IoU score between true boxes and + predicted boxes. GIoU loss adds a penalty term to the IoU loss that takes in + account the area of the smallest box enclosing both the boxes being + considered for the iou. The length of the last dimension should be 4 to + represent the bounding boxes. Args: bounding_box_format: a case-insensitive string (for example, "xyxy"). - Each bounding box is defined by these 4 values.For detailed information - on the supported formats, see the - [KerasCV bounding box documentation](https://keras.io/api/keras_cv/bounding_box/formats/). - axis: the axis along which to mean the ious. Defaults to -1. + Each bounding box is defined by these 4 values.For detailed + information on the supported formats, see the [KerasCV bounding box + documentation](https://keras.io/api/keras_cv/bounding_box/formats/). + axis: the axis along which to mean the ious, defaults to -1. 
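As a reference for the behaviour described above, GIoU is the plain IoU minus a penalty for the part of the smallest enclosing box not covered by the union. A minimal NumPy-free sketch for a single pair of `xyxy` boxes follows; it is illustrative only and not the loss's internal `_compute_giou`.

```python
def giou_xyxy(b1, b2):
    # Illustrative GIoU for two boxes in [x1, y1, x2, y2] format.
    ix1, iy1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    ix2, iy2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    area2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    union = area1 + area2 - inter
    iou = inter / union
    # Smallest box enclosing both; its uncovered area is the GIoU penalty.
    ex1, ey1 = min(b1[0], b2[0]), min(b1[1], b2[1])
    ex2, ey2 = max(b1[2], b2[2]), max(b1[3], b2[3])
    enclose = (ex2 - ex1) * (ey2 - ey1)
    return iou - (enclose - union) / enclose

print(giou_xyxy([0, 0, 2, 2], [1, 1, 3, 3]))  # ~ -0.079 (IoU 1/7 minus penalty 2/9)
```

The GIoU loss is conventionally defined as `1 - GIoU`, and the per-pair values are then averaged along the configured `axis`.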
References: - [GIoU paper](https://arxiv.org/pdf/1902.09630) @@ -43,8 +44,16 @@ class GIoULoss(keras.losses.Loss): Sample Usage: ```python - y_true = tf.random.uniform((5, 10, 5), minval=0, maxval=10, dtype=tf.dtypes.int32) - y_pred = tf.random.uniform((5, 10, 4), minval=0, maxval=10, dtype=tf.dtypes.int32) + y_true = tf.random.uniform( + (5, 10, 5), + minval=0, + maxval=10, + dtype=tf.dtypes.int32) + y_pred = tf.random.uniform( + (5, 10, 4), + minval=0, + maxval=10, + dtype=tf.dtypes.int32) loss = GIoULoss(bounding_box_format = "xyWH") loss(y_true, y_pred).numpy() ``` @@ -53,7 +62,7 @@ class GIoULoss(keras.losses.Loss): ```python model.compile(optimizer='adam', loss=keras_cv.losses.GIoULoss()) ``` - """ + """ # noqa: E501 def __init__(self, bounding_box_format, axis=-1, **kwargs): super().__init__(**kwargs) @@ -88,17 +97,17 @@ def _compute_giou(self, boxes1, boxes2): if boxes1_rank not in [2, 3]: raise ValueError( - "compute_iou() expects boxes1 to be batched, or " - f"to be unbatched. Received len(boxes1.shape)={boxes1_rank}, " - f"len(boxes2.shape)={boxes2_rank}. Expected either len(boxes1.shape)=2 AND " - "or len(boxes1.shape)=3." + "compute_iou() expects boxes1 to be batched, or to be " + f"unbatched. Received len(boxes1.shape)={boxes1_rank}, " + f"len(boxes2.shape)={boxes2_rank}. Expected either " + "len(boxes1.shape)=2 AND or len(boxes1.shape)=3." ) if boxes2_rank not in [2, 3]: raise ValueError( - "compute_iou() expects boxes2 to be batched, or " - f"to be unbatched. Received len(boxes1.shape)={boxes1_rank}, " - f"len(boxes2.shape)={boxes2_rank}. Expected either len(boxes2.shape)=2 AND " - "or len(boxes2.shape)=3." + "compute_iou() expects boxes2 to be batched, or to be " + f"unbatched. Received len(boxes1.shape)={boxes1_rank}, " + f"len(boxes2.shape)={boxes2_rank}. Expected either " + "len(boxes2.shape)=2 AND or len(boxes2.shape)=3." ) target_format = "yxyx" @@ -133,8 +142,8 @@ def _compute_giou(self, boxes1, boxes2): def call(self, y_true, y_pred, sample_weight=None): if sample_weight is not None: raise ValueError( - "GIoULoss does not support sample_weight. Please ensure that sample_weight=None." - f"got sample_weight={sample_weight}" + "GIoULoss does not support sample_weight. Please ensure " + f"sample_weight=None. Got sample_weight={sample_weight}" ) y_pred = tf.convert_to_tensor(y_pred) @@ -142,30 +151,31 @@ def call(self, y_true, y_pred, sample_weight=None): if y_pred.shape[-1] != 4: raise ValueError( - "GIoULoss expects y_pred.shape[-1] to be 4 to represent " - f"the bounding boxes. Received y_pred.shape[-1]={y_pred.shape[-1]}." + "GIoULoss expects y_pred.shape[-1] to be 4 to represent the " + f"bounding boxes. Received y_pred.shape[-1]={y_pred.shape[-1]}." ) if y_true.shape[-1] != 4: raise ValueError( - "GIoULoss expects y_true.shape[-1] to be 4 to represent " - f"the bounding boxes. Received y_true.shape[-1]={y_true.shape[-1]}." + "GIoULoss expects y_true.shape[-1] to be 4 to represent the " + f"bounding boxes. Received y_true.shape[-1]={y_true.shape[-1]}." ) if y_true.shape[-2] != y_pred.shape[-2]: raise ValueError( - "GIoULoss expects number of boxes in y_pred to be equal to the number " - f"of boxes in y_true. Received number of boxes in y_true={y_true.shape[-2]} " - f"and number of boxes in y_pred={y_pred.shape[-2]}." + "GIoULoss expects number of boxes in y_pred to be equal to the " + "number of boxes in y_true. Received number of boxes in " + f"y_true={y_true.shape[-2]} and number of boxes in " + f"y_pred={y_pred.shape[-2]}." 
) giou = self._compute_giou(y_true, y_pred) giou = tf.linalg.diag_part(giou) if self.axis == "no_reduction": warnings.warn( - "`axis='no_reduction'` is a temporary API, and the API contract " - "will be replaced in the future with a more generic solution " - "covering all losses." + "`axis='no_reduction'` is a temporary API, and the API " + "contract will be replaced in the future with a more generic " + "solution covering all losses." ) else: giou = tf.reduce_mean(giou, axis=self.axis) diff --git a/keras_cv/losses/iou_loss.py b/keras_cv/losses/iou_loss.py index 15bad50140..36f2642a64 100644 --- a/keras_cv/losses/iou_loss.py +++ b/keras_cv/losses/iou_loss.py @@ -25,11 +25,11 @@ class IoULoss(keras.losses.Loss): """Implements the IoU Loss IoU loss is commonly used for object detection. This loss aims to directly - optimize the IoU score between true boxes and predicted boxes. The length of the - last dimension should be 4 to represent the bounding boxes. This loss - uses IoUs according to box pairs and therefore, the number of boxes in both y_true - and y_pred are expected to be equal i.e. the ith y_true box in a batch - will be compared the ith y_pred box. + optimize the IoU score between true boxes and predicted boxes. The length of + the last dimension should be 4 to represent the bounding boxes. This loss + uses IoUs according to box pairs and therefore, the number of boxes in both + y_true and y_pred are expected to be equal i.e. the ith + y_true box in a batch will be compared the ith y_pred box. Args: bounding_box_format: a case-insensitive string (for example, "xyxy"). @@ -41,15 +41,23 @@ class IoULoss(keras.losses.Loss): - `"quadratic"`. The loss will be calculated as 1 - iou2 - `"log"`. The loss will be calculated as -ln(iou) Defaults to "log". - axis: the axis along which to mean the ious. Defaults to -1. + axis: the axis along which to mean the ious, defaults to -1. References: - [UnitBox paper](https://arxiv.org/pdf/1608.01471) Sample Usage: ```python - y_true = tf.random.uniform((5, 10, 5), minval=0, maxval=10, dtype=tf.dtypes.int32) - y_pred = tf.random.uniform((5, 10, 4), minval=0, maxval=10, dtype=tf.dtypes.int32) + y_true = tf.random.uniform( + (5, 10, 5), + minval=0, + maxval=10, + dtype=tf.dtypes.int32) + y_pred = tf.random.uniform( + (5, 10, 4), + minval=0, + maxval=10, + dtype=tf.dtypes.int32) loss = IoULoss(bounding_box_format = "xyWH") loss(y_true, y_pred).numpy() ``` @@ -58,7 +66,7 @@ class IoULoss(keras.losses.Loss): ```python model.compile(optimizer='adam', loss=keras_cv.losses.IoULoss()) ``` - """ + """ # noqa: E501 def __init__(self, bounding_box_format, mode="log", axis=-1, **kwargs): super().__init__(**kwargs) @@ -68,8 +76,8 @@ def __init__(self, bounding_box_format, mode="log", axis=-1, **kwargs): if self.mode not in ["linear", "quadratic", "log"]: raise ValueError( - "IoULoss expects mode to be one of 'linear', 'quadratic' or 'log' " - f"Received mode={self.mode}, " + "IoULoss expects mode to be one of 'linear', 'quadratic' or " + f"'log' Received mode={self.mode}, " ) def call(self, y_true, y_pred): @@ -78,21 +86,22 @@ def call(self, y_true, y_pred): if y_pred.shape[-1] != 4: raise ValueError( - "IoULoss expects y_pred.shape[-1] to be 4 to represent " - f"the bounding boxes. Received y_pred.shape[-1]={y_pred.shape[-1]}." + "IoULoss expects y_pred.shape[-1] to be 4 to represent the " + f"bounding boxes. Received y_pred.shape[-1]={y_pred.shape[-1]}." 
) if y_true.shape[-1] != 4: raise ValueError( - "IoULoss expects y_true.shape[-1] to be 4 to represent " - f"the bounding boxes. Received y_true.shape[-1]={y_true.shape[-1]}." + "IoULoss expects y_true.shape[-1] to be 4 to represent the " + f"bounding boxes. Received y_true.shape[-1]={y_true.shape[-1]}." ) if y_true.shape[-2] != y_pred.shape[-2]: raise ValueError( - "IoULoss expects number of boxes in y_pred to be equal to the number " - f"of boxes in y_true. Received number of boxes in y_true={y_true.shape[-2]} " - f"and number of boxes in y_pred={y_pred.shape[-2]}." + "IoULoss expects number of boxes in y_pred to be equal to the " + "number of boxes in y_true. Received number of boxes in " + f"y_true={y_true.shape[-2]} and number of boxes in " + f"y_pred={y_pred.shape[-2]}." ) iou = bounding_box.compute_iou(y_true, y_pred, self.bounding_box_format) @@ -100,9 +109,9 @@ def call(self, y_true, y_pred): iou = tf.linalg.diag_part(iou) if self.axis == "no_reduction": warnings.warn( - "`axis='no_reduction'` is a temporary API, and the API contract " - "will be replaced in the future with a more generic solution " - "covering all losses." + "`axis='no_reduction'` is a temporary API, and the API " + "contract will be replaced in the future with a more generic " + "solution covering all losses." ) else: iou = tf.reduce_mean(iou, axis=self.axis) diff --git a/keras_cv/losses/numerical_tests/focal_loss_numerical_test.py b/keras_cv/losses/numerical_tests/focal_loss_numerical_test.py index ee3535eef8..32019e5076 100644 --- a/keras_cv/losses/numerical_tests/focal_loss_numerical_test.py +++ b/keras_cv/losses/numerical_tests/focal_loss_numerical_test.py @@ -38,7 +38,8 @@ def call(self, y_true, y_pred): ) probs = tf.sigmoid(y_pred) probs_gt = tf.where(positive_label_mask, probs, 1.0 - probs) - # With small gamma, the implementation could produce NaN during back prop. + # With small gamma, the implementation could produce NaN during back + # prop. modulator = tf.pow(1.0 - probs_gt, self._gamma) loss = modulator * cross_entropy weighted_loss = tf.where( diff --git a/keras_cv/losses/penalty_reduced_focal_loss.py b/keras_cv/losses/penalty_reduced_focal_loss.py index e950186f77..665675e522 100644 --- a/keras_cv/losses/penalty_reduced_focal_loss.py +++ b/keras_cv/losses/penalty_reduced_focal_loss.py @@ -17,27 +17,32 @@ # TODO(tanzhenyu): consider inherit from LossFunctionWrapper to -# get the dimension squeeze. +# get the dimension squeeze. @keras.utils.register_keras_serializable(package="keras_cv") class BinaryPenaltyReducedFocalCrossEntropy(keras.losses.Loss): """Implements CenterNet modified Focal loss. - Compared with `keras.losses.BinaryFocalCrossentropy`, this loss discounts for negative - labels that have value less than `positive_threshold`, the larger value the negative label - is, the more discount to the final loss. + Compared with `keras.losses.BinaryFocalCrossentropy`, this loss discounts + for negative labels that have value less than `positive_threshold`, the + larger value the negative label is, the more discount to the final loss. - User can choose to divide the number of keypoints outside the loss computation, or by - passing in `sample_weight` as 1.0/num_key_points. + User can choose to divide the number of keypoints outside the loss + computation, or by passing in `sample_weight` as 1.0/num_key_points. Args: alpha: a focusing parameter used to compute the focal factor. - Defaults to 2.0. Note, this is equivalent to the `gamma` parameter in `keras.losses.BinaryFocalCrossentropy`. 
- beta: a float parameter, penalty exponent for negative labels. Defaults to 4.0. - from_logits: Whether `y_pred` is expected to be a logits tensor. Defaults + Defaults to 2.0. Note, this is equivalent to the `gamma` parameter in + `keras.losses.BinaryFocalCrossentropy`. + beta: a float parameter, penalty exponent for negative labels, defaults to + 4.0. + from_logits: Whether `y_pred` is expected to be a logits tensor, defaults to `False`. - positive_threshold: Anything bigger than this is treated as positive label. Defaults to 0.99. - positive_weight: single scalar weight on positive examples. Defaults to 1.0. - negative_weight: single scalar weight on negative examples. Defaults to 1.0. + positive_threshold: Anything bigger than this is treated as positive + label, defaults to 0.99. + positive_weight: single scalar weight on positive examples, defaults to + 1.0. + negative_weight: single scalar weight on negative examples, defaults to + 1.0. Inputs: y_true: [batch_size, ...] float tensor @@ -45,8 +50,9 @@ class BinaryPenaltyReducedFocalCrossEntropy(keras.losses.Loss): References: - [Objects as Points](https://arxiv.org/pdf/1904.07850.pdf) Eq 1. - - [Cornernet: Detecting objects as paired keypoints](https://arxiv.org/abs/1808.01244) for `alpha` and `beta`. - """ + - [Cornernet: Detecting objects as paired keypoints](https://arxiv.org/abs/1808.01244) for `alpha` and + `beta`. + """ # noqa: E501 def __init__( self, @@ -74,8 +80,8 @@ def call(self, y_true, y_pred): if self.from_logits: y_pred = tf.nn.sigmoid(y_pred) - # TODO(tanzhenyu): Evaluate whether we need clipping after - # model is trained. + # TODO(tanzhenyu): Evaluate whether we need clipping after model is + # trained. y_pred = tf.clip_by_value(y_pred, 1e-4, 0.9999) y_true = tf.clip_by_value(y_true, 0.0, 1.0) diff --git a/keras_cv/losses/simclr_loss.py b/keras_cv/losses/simclr_loss.py index 25ce6d83f6..f40e19e76a 100644 --- a/keras_cv/losses/simclr_loss.py +++ b/keras_cv/losses/simclr_loss.py @@ -24,7 +24,8 @@ class SimCLRLoss(keras.losses.Loss): SimCLR loss is used for contrastive self-supervised learning. Args: - temperature: a float value between 0 and 1, used as a scaling factor for cosine similarity. + temperature: a float value between 0 and 1, used as a scaling factor for + cosine similarity. References: - [SimCLR paper](https://arxiv.org/pdf/2002.05709) @@ -35,14 +36,18 @@ def __init__(self, temperature, **kwargs): self.temperature = temperature def call(self, projections_1, projections_2): - """Computes SimCLR loss for a pair of projections in a contrastive learning trainer. + """Computes SimCLR loss for a pair of projections in a contrastive + learning trainer. - Note that unlike most loss functions, this should not be called with y_true and y_pred, - but with two unlabeled projections. It can otherwise be treated as a normal loss function. + Note that unlike most loss functions, this should not be called with + y_true and y_pred, but with two unlabeled projections. It can otherwise + be treated as a normal loss function. 
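To make the "two unlabeled projections" calling convention concrete, here is a rough sketch of the temperature-scaled contrastive objective the docstring describes. It is a simplified stand-in (cross-view negatives only), not KerasCV's `SimCLRLoss` implementation.

```python
import tensorflow as tf

def simclr_loss_sketch(projections_1, projections_2, temperature=0.1):
    # Each projection in one view should match its counterpart in the other
    # view under temperature-scaled cosine similarity.
    p1 = tf.math.l2_normalize(projections_1, axis=-1)
    p2 = tf.math.l2_normalize(projections_2, axis=-1)
    similarities = tf.matmul(p1, p2, transpose_b=True) / temperature
    labels = tf.range(tf.shape(p1)[0])
    loss_1_2 = tf.keras.losses.sparse_categorical_crossentropy(
        labels, similarities, from_logits=True
    )
    loss_2_1 = tf.keras.losses.sparse_categorical_crossentropy(
        labels, tf.transpose(similarities), from_logits=True
    )
    return tf.reduce_mean(loss_1_2 + loss_2_1) / 2.0

projections_1 = tf.random.normal((8, 128))
projections_2 = tf.random.normal((8, 128))
print(simclr_loss_sketch(projections_1, projections_2))
```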
Args: - projections_1: a tensor with the output of the first projection model in a contrastive learning trainer - projections_2: a tensor with the output of the second projection model in a contrastive learning trainer + projections_1: a tensor with the output of the first projection + model in a contrastive learning trainer + projections_2: a tensor with the output of the second projection + model in a contrastive learning trainer Returns: A tensor with the SimCLR loss computed from the input projections diff --git a/keras_cv/losses/smooth_l1.py b/keras_cv/losses/smooth_l1.py index e6a6ebb5b3..e506831fe2 100644 --- a/keras_cv/losses/smooth_l1.py +++ b/keras_cv/losses/smooth_l1.py @@ -20,13 +20,14 @@ class SmoothL1Loss(keras.losses.Loss): """Implements Smooth L1 loss. - SmoothL1Loss implements the SmoothL1 function, where values less than `l1_cutoff` - contribute to the overall loss based on their squared difference, and values greater - than l1_cutoff contribute based on their raw difference. + SmoothL1Loss implements the SmoothL1 function, where values less than + `l1_cutoff` contribute to the overall loss based on their squared + difference, and values greater than l1_cutoff contribute based on their raw + difference. Args: - l1_cutoff: differences between y_true and y_pred that are larger than `l1_cutoff` are - treated as `L1` values + l1_cutoff: differences between y_true and y_pred that are larger than + `l1_cutoff` are treated as `L1` values """ def __init__(self, l1_cutoff=1.0, **kwargs): diff --git a/keras_cv/metrics/coco/__init__.py b/keras_cv/metrics/coco/__init__.py index 21adb9125a..7ec7884e5e 100644 --- a/keras_cv/metrics/coco/__init__.py +++ b/keras_cv/metrics/coco/__init__.py @@ -17,7 +17,7 @@ from keras_cv.metrics.coco.pycoco_wrapper import compute_pycoco_metrics except ImportError: print( - "You do not have pycocotools installed, so KerasCV pycoco metrics are not available. " - "Please run `pip install pycocotools`." + "You do not have pycocotools installed, so KerasCV pycoco metrics are " + "not available. Please run `pip install pycocotools`." ) pass diff --git a/keras_cv/metrics/coco/pycoco_wrapper.py b/keras_cv/metrics/coco/pycoco_wrapper.py index 3afebe4264..ca9ea0c89c 100644 --- a/keras_cv/metrics/coco/pycoco_wrapper.py +++ b/keras_cv/metrics/coco/pycoco_wrapper.py @@ -49,8 +49,8 @@ def __init__(self, gt_dataset=None): """Instantiates a COCO-style API object. Args: eval_type: either 'box' or 'mask'. - annotation_file: a JSON file that stores annotations of the eval dataset. - This is required if `gt_dataset` is not provided. + annotation_file: a JSON file that stores annotations of the eval + dataset. This is required if `gt_dataset` is not provided. gt_dataset: the groundtruth eval datatset in COCO API format. """ @@ -63,14 +63,14 @@ def __init__(self, gt_dataset=None): def loadRes(self, predictions): """Loads result file and return a result api object. Args: - predictions: a list of dictionary each representing an annotation in COCO - format. The required fields are `image_id`, `category_id`, `score`, - `bbox`, `segmentation`. + predictions: a list of dictionary each representing an annotation in + COCO format. The required fields are `image_id`, `category_id`, + `score`, `bbox`, `segmentation`. Returns: res: result COCO api object. Raises: - ValueError: if the set of image id from predictions is not the subset of - the set of image id of the groundtruth dataset.
+ ValueError: if the set of image id from predictions is not the subset + of the set of image id of the groundtruth dataset. """ res = COCO() res.dataset["images"] = copy.deepcopy(self.dataset["images"]) diff --git a/keras_cv/metrics/object_detection/box_coco_metrics.py b/keras_cv/metrics/object_detection/box_coco_metrics.py index 9def0908ab..a870c1d716 100644 --- a/keras_cv/metrics/object_detection/box_coco_metrics.py +++ b/keras_cv/metrics/object_detection/box_coco_metrics.py @@ -85,18 +85,18 @@ class BoxCOCOMetrics(keras.metrics.Metric): bounding_box_format: the bounding box format for inputs. evaluate_freq: the number of steps to run before each evaluation. Due to the high computational cost of metric evaluation the final - results are only updated once every `evaluate_freq` steps. Higher + results are only updated once every `evaluate_freq` steps. Higher values will allow for faster training times, while lower numbers allow for higher numerical precision in metric reporting. Usage: `BoxCOCOMetrics()` can be used like any standard metric with any - KerasCV object detection model. Inputs to `y_true` must be KerasCV bounding + KerasCV object detection model. Inputs to `y_true` must be KerasCV bounding box dictionaries, `{"classes": classes, "boxes": boxes}`, and `y_pred` must follow the same format with an additional `confidence` key. Unfortunately, at the moment `BoxCOCOMetrics()` are not TPU compatible with - the `fit()` API. If you wish to evaluate `BoxCOCOMetrics()` for a model + the `fit()` API. If you wish to evaluate `BoxCOCOMetrics()` for a model trained on TPU, we recommend using the `model.predict()` API and manually updating the metric state with the results. @@ -226,9 +226,9 @@ def update_state(self, y_true, y_pred, sample_weight=None): self.ground_truths.append(y_true) self.predictions.append(y_pred) - # compute on first step so we don't have an inconsistent list of metrics - # in our train_step() results. This will just populate the metrics with - # `0.0` until we get to `evaluate_freq`. + # Compute on first step, so we don't have an inconsistent list of + # metrics in our train_step() results. This will just populate the + # metrics with `0.0` until we get to `evaluate_freq`. if self._eval_step_count % self.evaluate_freq == 0: self._cached_result = self._compute_result() diff --git a/keras_cv/models/__internal__/darknet_utils.py b/keras_cv/models/__internal__/darknet_utils.py index 357a0a7978..905d68bcbb 100644 --- a/keras_cv/models/__internal__/darknet_utils.py +++ b/keras_cv/models/__internal__/darknet_utils.py @@ -27,20 +27,21 @@ def DarknetConvBlock( filters, kernel_size, strides, use_bias=False, activation="silu", name=None ): - """The basic conv block used in Darknet. Applies Conv2D followed by a BatchNorm. + """The basic conv block used in Darknet. Applies Conv2D followed by a + BatchNorm. Args: - filters: Integer, the dimensionality of the output space (i.e. the number of - output filters in the convolution). - kernel_size: An integer or tuple/list of 2 integers, specifying the height - and width of the 2D convolution window. Can be a single integer to specify - the same value both dimensions. - strides: An integer or tuple/list of 2 integers, specifying the strides of - the convolution along the height and width. Can be a single integer to - the same value both dimensions. + filters: Integer, the dimensionality of the output space (i.e. the + number of output filters in the convolution). 
+ kernel_size: An integer or tuple/list of 2 integers, specifying the + height and width of the 2D convolution window. Can be a single + integer to specify the same value both dimensions. + strides: An integer or tuple/list of 2 integers, specifying the strides + of the convolution along the height and width. Can be a single + integer to the same value both dimensions. use_bias: Boolean, whether the layer uses a bias vector. - activation: the activation applied after the BatchNorm layer. One of "silu", - "relu" or "leaky_relu". Defaults to "silu". + activation: the activation applied after the BatchNorm layer. One of + "silu", "relu" or "leaky_relu", defaults to "silu". name: the prefix for the layer names used in the block. """ @@ -73,8 +74,8 @@ def ResidualBlocks(filters, num_blocks, name=None): """A residual block used in DarkNet models, repeated `num_blocks` times. Args: - filters: Integer, the dimensionality of the output spaces (i.e. the number of - output filters in used the blocks). + filters: Integer, the dimensionality of the output spaces (i.e. the + number of output filters in used the blocks). num_blocks: number of times the residual connections are repeated name: the prefix for the layer names used in the block. @@ -132,18 +133,20 @@ def SpatialPyramidPoolingBottleneck( """Spatial pyramid pooling layer used in YOLOv3-SPP Args: - filters: Integer, the dimensionality of the output spaces (i.e. the number of - output filters in used the blocks). - hidden_filters: Integer, the dimensionality of the intermediate bottleneck space - (i.e. the number of output filters in the bottleneck convolution). If None, - it will be equal to filters. Defaults to None. - kernel_sizes: A list or tuple representing all the pool sizes used for the - pooling layers. Defaults to (5, 9, 13). - activation: Activation for the conv layers. Defaults to "silu". + filters: Integer, the dimensionality of the output spaces (i.e. the + number of output filters in used the blocks). + hidden_filters: Integer, the dimensionality of the intermediate + bottleneck space (i.e. the number of output filters in the + bottleneck convolution). If None, it will be equal to filters. + Defaults to None. + kernel_sizes: A list or tuple representing all the pool sizes used for + the pooling layers, defaults to (5, 9, 13). + activation: Activation for the conv layers, defaults to "silu". name: the prefix for the layer names used in the block. Returns: - a function that takes an input Tensor representing an SpatialPyramidPoolingBottleneck. + a function that takes an input Tensor representing an + SpatialPyramidPoolingBottleneck. """ if name is None: name = f"spp{backend.get_uid('spp')}" @@ -191,16 +194,16 @@ def DarknetConvBlockDepthwise( """The depthwise conv block used in CSPDarknet. Args: - filters: Integer, the dimensionality of the output space (i.e. the number of - output filters in the final convolution). - kernel_size: An integer or tuple/list of 2 integers, specifying the height - and width of the 2D convolution window. Can be a single integer to specify - the same value both dimensions. - strides: An integer or tuple/list of 2 integers, specifying the strides of - the convolution along the height and width. Can be a single integer to - the same value both dimensions. + filters: Integer, the dimensionality of the output space (i.e. the + number of output filters in the final convolution). + kernel_size: An integer or tuple/list of 2 integers, specifying the + height and width of the 2D convolution window. 
Can be a single + integer to specify the same value both dimensions. + strides: An integer or tuple/list of 2 integers, specifying the strides + of the convolution along the height and width. Can be a single + integer to the same value both dimensions. activation: the activation applied after the final layer. One of "silu", - "relu" or "leaky_relu". Defaults to "silu". + "relu" or "leaky_relu", defaults to "silu". name: the prefix for the layer names used in the block. """ @@ -236,17 +239,18 @@ class CrossStagePartial(layers.Layer): """A block used in Cross Stage Partial Darknet. Args: - filters: Integer, the dimensionality of the output space (i.e. the number of - output filters in the final convolution). - num_bottlenecks: an integer representing the number of blocks added in the - layer bottleneck. + filters: Integer, the dimensionality of the output space (i.e. the + number of output filters in the final convolution). + num_bottlenecks: an integer representing the number of blocks added in + the layer bottleneck. residual: a boolean representing whether the value tensor before the - bottleneck should be added to the output of the bottleneck as a residual. - Defaults to True. - use_depthwise: a boolean value used to decide whether a depthwise conv block - should be used over a regular darknet block. Defaults to False + bottleneck should be added to the output of the bottleneck as a + residual, defaults to True. + use_depthwise: a boolean value used to decide whether a depthwise conv + block should be used over a regular darknet block, defaults to + False. activation: the activation applied after the final layer. One of "silu", - "relu" or "leaky_relu". Defaults to "silu". + "relu" or "leaky_relu", defaults to "silu". """ def __init__( @@ -341,18 +345,19 @@ def get_config(self): def Focus(name=None): - """A block used in CSPDarknet to focus information into channels of the image. + """A block used in CSPDarknet to focus information into channels of the + image. - If the dimensions of a batch input is (batch_size, width, height, channels), this - layer converts the image into size (batch_size, width/2, height/2, 4*channels). - See [the original discussion on YoloV5 Focus Layer](https://github.com/ultralytics/yolov5/discussions/3181). + If the dimensions of a batch input is (batch_size, width, height, channels), + this layer converts the image into size (batch_size, width/2, height/2, + 4*channels). See [the original discussion on YoloV5 Focus Layer](https://github.com/ultralytics/yolov5/discussions/3181). Args: name: the name for the lambda layer used in the block. Returns: a function that takes an input Tensor representing a Focus layer. 
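The `(batch, width, height, channels) -> (batch, width/2, height/2, 4*channels)` rearrangement described above can be pictured as sampling the image at four pixel-phase offsets and stacking them on the channel axis. A minimal sketch follows; the slice ordering is illustrative and not necessarily the one used by the Lambda layer in this block.

```python
import tensorflow as tf

def focus_sketch(x):
    # Take every other pixel at four phase offsets and stack on channels:
    # (b, h, w, c) -> (b, h/2, w/2, 4c).
    return tf.concat(
        [
            x[:, ::2, ::2, :],
            x[:, 1::2, ::2, :],
            x[:, ::2, 1::2, :],
            x[:, 1::2, 1::2, :],
        ],
        axis=-1,
    )

images = tf.random.uniform((8, 64, 64, 3))
print(focus_sketch(images).shape)  # (8, 32, 32, 12)
```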
- """ + """ # noqa: E501 def apply(x): return layers.Lambda( diff --git a/keras_cv/models/__internal__/unet.py b/keras_cv/models/__internal__/unet.py index 88c439ae3a..f3c41d260a 100644 --- a/keras_cv/models/__internal__/unet.py +++ b/keras_cv/models/__internal__/unet.py @@ -179,8 +179,8 @@ def UNet( Args: input_shape: the rank 3 shape of the input to the UNet - down_block_configs: a list of (filter_count, num_blocks) tuples indicating the - number of filters and sub-blocks in each down block + down_block_configs: a list of (filter_count, num_blocks) tuples + indicating the number of filters and sub-blocks in each down block up_block_configs: a list of filter counts, one for each up block down_block: a downsampling block up_block: an upsampling block diff --git a/keras_cv/models/backbones/backbone.py b/keras_cv/models/backbones/backbone.py index 37e5297059..02d0ad64d6 100644 --- a/keras_cv/models/backbones/backbone.py +++ b/keras_cv/models/backbones/backbone.py @@ -25,7 +25,7 @@ class Backbone(keras.Model): """Base class for Backbone models. Backbones are reusable layers of models trained on a standard task such as - Imagenet classifcation that can be reused in other tasks. + Imagenet classification that can be reused in other tasks. """ def __init__(self, *args, **kwargs): @@ -53,7 +53,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return {} @classmethod @@ -63,7 +64,8 @@ def from_preset( load_weights=None, **kwargs, ): - """Instantiate {{model_name}} model from preset architecture and weights. + """Instantiate {{model_name}} model from preset architecture and + weights. Args: preset: string. Must be one of "{{preset_names}}". @@ -124,11 +126,11 @@ def from_preset( return model def __init_subclass__(cls, **kwargs): - # Use __init_subclass__ to setup a correct docstring for from_preset. + # Use __init_subclass__ to set up a correct docstring for from_preset. super().__init_subclass__(**kwargs) # If the subclass does not define from_preset, assign a wrapper so that - # each class can have an distinct docstring. + # each class can have a distinct docstring. if "from_preset" not in cls.__dict__: def from_preset(calling_cls, *args, **kwargs): diff --git a/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone.py b/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone.py index 7a2acac3a4..bc0eebef39 100644 --- a/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone.py +++ b/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone.py @@ -13,8 +13,9 @@ # limitations under the License. """ResNet models for KerasCV. Reference: - - [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) (CVPR 2015) - - [Based on the original keras.applications ResNet](https://github.com/keras-team/keras/blob/master/keras/applications/resnet.py) + - [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) + (CVPR 2015) + - [Based on the original keras.applications ResNet](https://github.com/keras-team/keras/blob/master/keras/applications/resnet.py) # noqa: E501 """ import copy @@ -45,8 +46,8 @@ def apply_basic_block( Args: x: input tensor. filters: int, filters of the basic layer. - kernel_size: int, kernel size of the bottleneck layer. Defaults to 3. - stride: int, stride of the first layer. Defaults to 1. + kernel_size: int, kernel size of the bottleneck layer, defaults to 3. 
+ stride: int, stride of the first layer, defaults to 1. conv_shortcut: bool, uses convolution shortcut if `True`. If `False` (default), uses identity or pooling shortcut, based on stride. name: string, optional prefix for the layer names used in the block. @@ -109,8 +110,8 @@ def apply_block( Args: x: input tensor. filters: int, filters of the basic layer. - kernel_size: int, kernel size of the bottleneck layer. Defaults to 3. - stride: int, stride of the first layer. Defaults to 1. + kernel_size: int, kernel size of the bottleneck layer, defaults to 3. + stride: int, stride of the first layer, defaults to 1. conv_shortcut: bool, uses convolution shortcut if `True`. If `False` (default), uses identity or pooling shortcut, based on stride. name: string, optional prefix for the layer names used in the block. @@ -181,7 +182,8 @@ def apply_stack( x: input tensor. filters: int, filters of the layer in a block. blocks: int, blocks in the stacked blocks. - stride: int, stride of the first layer in the first block. Defaults to 2. + stride: int, stride of the first layer in the first block, defaults to + 2. name: string, optional prefix for the layer names used in the block. block_type: string, one of "basic_block" or "block". The block type to stack. Use "basic_block" for ResNet18 and ResNet34. @@ -241,7 +243,7 @@ class ResNetBackbone(Backbone): stackwise_blocks: list of ints, number of blocks for each stack in the model. stackwise_strides: list of ints, stride for each stack in the model. - include_rescaling: bool, whether or not to Rescale the inputs. If set + include_rescaling: bool, whether to rescale the inputs. If set to `True`, inputs will be passed through a `Rescaling(1/255.0)` layer. input_shape: optional shape tuple, defaults to (None, None, 3). @@ -267,7 +269,7 @@ class ResNetBackbone(Backbone): ) output = model(input_data) ``` - """ + """ # noqa: E501 def __init__( self, @@ -350,7 +352,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return copy.deepcopy(backbone_presets_with_weights) @@ -369,7 +372,7 @@ def presets_with_weights(cls): [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - include_rescaling: bool, whether or not to Rescale the inputs. If set + include_rescaling: bool, whether to rescale the inputs. If set to `True`, inputs will be passed through a `Rescaling(1/255.0)` layer. input_shape: optional shape tuple, defaults to (None, None, 3). 
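For readers skimming the reflowed `apply_basic_block` docstring above, the basic block is the classic ResNet v1 pattern: two 3x3 conv/BN/ReLU stages wrapped by a shortcut connection. The sketch below is simplified (a 1x1 convolution stands in for the identity/pooling shortcut when shapes change) and is not the library's exact implementation.

```python
from tensorflow import keras

def basic_block_sketch(x, filters, stride=1, conv_shortcut=False):
    # Shortcut around two 3x3 convolutions, as in ResNet v1 basic blocks.
    shortcut = x
    if conv_shortcut or stride > 1:
        shortcut = keras.layers.Conv2D(filters, 1, strides=stride)(x)
    x = keras.layers.Conv2D(filters, 3, strides=stride, padding="same")(x)
    x = keras.layers.BatchNormalization()(x)
    x = keras.layers.ReLU()(x)
    x = keras.layers.Conv2D(filters, 3, padding="same")(x)
    x = keras.layers.BatchNormalization()(x)
    x = keras.layers.Add()([shortcut, x])
    return keras.layers.ReLU()(x)

inputs = keras.Input(shape=(56, 56, 64))
outputs = basic_block_sketch(inputs, filters=64)
print(keras.Model(inputs, outputs).output_shape)  # (None, 56, 56, 64)
```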
@@ -384,7 +387,7 @@ def presets_with_weights(cls): model = ResNet{num_layers}Backbone() output = model(input_data) ``` -""" +""" # noqa: E501 class ResNet18Backbone(ResNetBackbone): @@ -412,7 +415,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return {} @@ -441,7 +445,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return {} @@ -474,7 +479,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return cls.presets @@ -503,7 +509,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return {} @@ -532,7 +539,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return {} diff --git a/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets.py b/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets.py index ee11cf07d5..c72504cb19 100644 --- a/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets.py +++ b/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets.py @@ -123,8 +123,8 @@ }, "class_name": "keras_cv.models>ResNetBackbone", "config": backbone_presets_no_weights["resnet50"]["config"], - "weights_url": "https://storage.googleapis.com/keras-cv/models/resnet50/imagenet/classification-v0-notop.h5", - "weights_hash": "dc5f6d8f929c78d0fc192afecc67b11ac2166e9d8b9ef945742368ae254c07af", + "weights_url": "https://storage.googleapis.com/keras-cv/models/resnet50/imagenet/classification-v0-notop.h5", # noqa: E501 + "weights_hash": "dc5f6d8f929c78d0fc192afecc67b11ac2166e9d8b9ef945742368ae254c07af", # noqa: E501 }, } diff --git a/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets_test.py b/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets_test.py index d45f00a138..d96182a17c 100644 --- a/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets_test.py +++ b/keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets_test.py @@ -31,7 +31,7 @@ class ResNetPresetSmokeTest(tf.test.TestCase, parameterized.TestCase): """ A smoke test for ResNet presets we run continuously. This only tests the smallest weights we have available. Run with: - `pytest keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets_test.py --run_large` + `pytest keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets_test.py --run_large` # noqa: E501 """ def setUp(self): @@ -110,7 +110,7 @@ class ResNetPresetFullTest(tf.test.TestCase, parameterized.TestCase): Test the full enumeration of our preset. This every presets for ResNet and is only run manually. 
Run with: - `pytest keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets_test.py --run_extra_large` + `pytest keras_cv/models/backbones/resnet_v1/resnet_v1_backbone_presets_test.py --run_extra_large` # noqa: E501 """ def test_load_resnet(self): diff --git a/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone.py b/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone.py index 21ae701561..bc5bab4a69 100644 --- a/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone.py +++ b/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone.py @@ -15,7 +15,7 @@ Reference: - [Identity Mappings in Deep Residual Networks](https://arxiv.org/abs/1603.05027) (ECCV 2016) - [Based on the original keras.applications ResNet](https://github.com/keras-team/keras/blob/master/keras/applications/resnet_v2.py) -""" +""" # noqa: E501 import copy @@ -51,8 +51,8 @@ def apply_basic_block( Args: x: input tensor. filters: int, filters of the basic layer. - kernel_size: int, kernel size of the bottleneck layer. Defaults to 3. - stride: int, stride of the first layer. Defaults to 1. + kernel_size: int, kernel size of the bottleneck layer, defaults to 3. + stride: int, stride of the first layer, defaults to 1. dilation: int, the dilation rate to use for dilated convolution. Defaults to 1. conv_shortcut: bool, uses convolution shortcut if `True`. If `False` @@ -129,8 +129,8 @@ def apply_block( Args: x: input tensor. filters: int, filters of the basic layer. - kernel_size: int, kernel size of the bottleneck layer. Defaults to 3. - stride: int, stride of the first layer. Defaults to 1. + kernel_size: int, kernel size of the bottleneck layer, defaults to 3. + stride: int, stride of the first layer, defaults to 1. dilation: int, the dilation rate to use for dilated convolution. Defaults to 1. conv_shortcut: bool, uses convolution shortcut if `True`. If `False` @@ -211,8 +211,9 @@ def apply_stack( x: input tensor. filters: int, filters of the layer in a block. blocks: int, blocks in the stacked blocks. - stride: int, stride of the first layer in the first block. Defaults to 2. - dilation: int, the dilation rate to use for dilated convolution. + stride: int, stride of the first layer in the first block, defaults + to 2. + dilations: int, the dilation rate to use for dilated convolution. Defaults to 1. name: string, optional prefix for the layer names used in the block. block_type: string, one of "basic_block" or "block". The block type to @@ -276,7 +277,7 @@ class ResNetV2Backbone(Backbone): stackwise_blocks: list of ints, number of blocks for each stack in the model. stackwise_strides: list of ints, stride for each stack in the model. - include_rescaling: bool, whether or not to Rescale the inputs. If set + include_rescaling: bool, whether to rescale the inputs. If set to `True`, inputs will be passed through a `Rescaling(1/255.0)` layer. stackwise_dilations: list of ints, dilation for each stack in the @@ -305,7 +306,7 @@ class ResNetV2Backbone(Backbone): ) output = model(input_data) ``` - """ + """ # noqa: E501 def __init__( self, @@ -399,7 +400,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return copy.deepcopy(backbone_presets_with_weights) @@ -418,7 +420,7 @@ def presets_with_weights(cls): [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). 
Args: - include_rescaling: bool, whether or not to Rescale the inputs. If set + include_rescaling: bool, whether to rescale the inputs. If set to `True`, inputs will be passed through a `Rescaling(1/255.0)` layer. input_shape: optional shape tuple, defaults to (None, None, 3). @@ -433,7 +435,7 @@ def presets_with_weights(cls): model = ResNet{num_layers}V2Backbone() output = model(input_data) ``` -""" +""" # noqa: E501 class ResNet18V2Backbone(ResNetV2Backbone): @@ -461,7 +463,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return {} @@ -490,7 +493,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return {} @@ -523,7 +527,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return cls.presets @@ -552,7 +557,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return {} @@ -581,7 +587,8 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return {} diff --git a/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone_presets.py b/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone_presets.py index 2282685368..bdd851310b 100644 --- a/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone_presets.py +++ b/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone_presets.py @@ -122,8 +122,8 @@ }, "class_name": "keras_cv.models>ResNetV2Backbone", "config": backbone_presets_no_weights["resnet50_v2"]["config"], - "weights_url": "https://storage.googleapis.com/keras-cv/models/resnet50v2/imagenet/classification-v2-notop.h5", - "weights_hash": "e711c83d6db7034871f6d345a476c8184eab99dbf3ffcec0c1d8445684890ad9", + "weights_url": "https://storage.googleapis.com/keras-cv/models/resnet50v2/imagenet/classification-v2-notop.h5", # noqa: E501 + "weights_hash": "e711c83d6db7034871f6d345a476c8184eab99dbf3ffcec0c1d8445684890ad9", # noqa: E501 }, } diff --git a/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone_presets_test.py b/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone_presets_test.py index bbf25e33bd..c1fa4e389b 100644 --- a/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone_presets_test.py +++ b/keras_cv/models/backbones/resnet_v2/resnet_v2_backbone_presets_test.py @@ -31,7 +31,7 @@ class ResNetV2PresetSmokeTest(tf.test.TestCase, parameterized.TestCase): A smoke test for ResNetV2 presets we run continuously. This only tests the smallest weights we have available. Run with: `pytest keras_cv/models/backbones/resnet_v2/resnetv2_presets_test.py --run_large` - """ + """ # noqa: E501 def setUp(self): self.input_batch = tf.ones(shape=(8, 224, 224, 3)) @@ -86,7 +86,7 @@ class ResNetV2PresetFullTest(tf.test.TestCase, parameterized.TestCase): This every presets for ResNetV2 and is only run manually. 
Run with: `pytest keras_cv/models/backbones/resnet_v2/resnet_v2_presets_test.py --run_extra_large` - """ + """ # noqa: E501 def test_load_resnetv2(self): input_data = tf.ones(shape=(8, 224, 224, 3)) diff --git a/keras_cv/models/classification/image_classifier.py b/keras_cv/models/classification/image_classifier.py index 52195beee6..6d34f502a1 100644 --- a/keras_cv/models/classification/image_classifier.py +++ b/keras_cv/models/classification/image_classifier.py @@ -39,7 +39,7 @@ class ImageClassifier(Task): dimension of the backbone output. num_classes: int, number of classes to predict. pooling: str, type of pooling layer. Must be one of "avg", "max". - activation: Optional `str` or callable. Defaults to "softmax". The + activation: Optional `str` or callable, defaults to "softmax". The activation function to use on the Dense layer. Set `activation=None` to return the output logits. @@ -129,12 +129,14 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return copy.deepcopy( {**backbone_presets_with_weights, **classifier_presets} ) @classproperty def backbone_presets(cls): - """Dictionary of preset names and configurations of compatible backbones.""" + """Dictionary of preset names and configurations of compatible + backbones.""" return copy.deepcopy(backbone_presets) diff --git a/keras_cv/models/classification/image_classifier_presets.py b/keras_cv/models/classification/image_classifier_presets.py index 4a6368df05..b8962bf37c 100644 --- a/keras_cv/models/classification/image_classifier_presets.py +++ b/keras_cv/models/classification/image_classifier_presets.py @@ -33,7 +33,7 @@ "pooling": "avg", "activation": "softmax", }, - "weights_url": "https://storage.googleapis.com/keras-cv/models/resnet50v2/imagenet-classifier-v0.h5", - "weights_hash": "77fa9f1cd1de0e202309e51d4e598e441d1111dacb6c41a182b6c63f76ff26cd", + "weights_url": "https://storage.googleapis.com/keras-cv/models/resnet50v2/imagenet-classifier-v0.h5", # noqa: E501 + "weights_hash": "77fa9f1cd1de0e202309e51d4e598e441d1111dacb6c41a182b6c63f76ff26cd", # noqa: E501 }, } diff --git a/keras_cv/models/convmixer.py b/keras_cv/models/convmixer.py index fa601e0166..475aa7501b 100644 --- a/keras_cv/models/convmixer.py +++ b/keras_cv/models/convmixer.py @@ -70,25 +70,26 @@ learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - include_rescaling: bool, whether or not to rescale the inputs. If set to True, - inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected layer at the top of the - network. If provided, num_classes must be provided. - num_classes: integer, optional number of classes to classify images into. Only to be - specified if `include_top` is True. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(1/255.0)` layer. + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, num_classes must be provided. + num_classes: integer, optional number of classes to classify images + into. Only to be specified if `include_top` is True. weights: one of `None` (random initialization), a pretrained weight file - path, or a reference to pre-trained weights (e.g. 
'imagenet/classification') - (see available pre-trained weights in weights.py) + path, or a reference to pre-trained weights (e.g. + 'imagenet/classification')(see available pre-trained weights in + weights.py) input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e., output of `layers.Input()`) to use as image input for the model. pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. - - `avg` means that global average pooling will be applied to the output - of the last convolutional block, and thus the output of the model will - be a 2D tensor. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. + - `avg` means that global average pooling will be applied to the + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. name: string, optional name to pass to the model, defaults to "{name}". @@ -146,32 +147,31 @@ class ConvMixer(keras.Model): depth: integer, number of ConvMixer Layer. patch_size: integer, size of the patches. kernel_size: integer, kernel size for Conv2D layers. - include_top: bool, whether to include the fully-connected - layer at the top of the network. + include_top: bool, whether to include the fully-connected layer at the + top of the network. include_rescaling: bool, whether to rescale the inputs. If set to True, inputs will be passed through a `Rescaling(1/255.0)` layer. - name: string, optional name to pass to the model, defaults to "ConvMixer". - weights: one of `None` (random initialization) - or the path to the weights file to be loaded. + name: string, optional name to pass to the model, defaults to + "ConvMixer". + weights: one of `None` (random initialization) or the path to the + weights file to be loaded. input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e., output of `layers.Input()`) to use as image input for the model. - pooling: optional pooling mode for feature extraction - when `include_top` is `False`. - - `None` means that the output of the model will be - the 4D tensor output of the - last convolutional layer. - - `avg` means that global average pooling - will be applied to the output of the - last convolutional layer, and thus - the output of the model will be a 2D tensor. - - `max` means that global max pooling will - be applied. + pooling: optional pooling mode for feature extraction when `include_top` + is `False`. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional layer. + - `avg` means that global average pooling will be applied to the + output of the last convolutional layer, and thus the output of + the model will be a 2D tensor. + - `max` means that global max pooling will be applied. num_classes: integer, optional number of classes to classify images into. Only to be specified if `include_top` is True. - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. 
Set + `classifier_activation=None` to return the logits of the "top" + layer. **kwargs: Pass-through keyword arguments to `keras.Model`. Returns: @@ -197,8 +197,9 @@ def __init__( ): if weights and not tf.io.gfile.exists(weights): raise ValueError( - "The `weights` argument should be either `None` or the path to the " - f"weights file to be loaded. Weights file not found at location: {weights}" + "The `weights` argument should be either `None` or the path to " + "the weights file to be loaded. Weights file not found at " + f"location: {weights}" ) if include_top and not num_classes: diff --git a/keras_cv/models/convnext.py b/keras_cv/models/convnext.py index ed7e61a89b..0f5cbc0312 100644 --- a/keras_cv/models/convnext.py +++ b/keras_cv/models/convnext.py @@ -55,15 +55,15 @@ } BASE_DOCSTRING = """Instantiates the {name} architecture. - - [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545) - (CVPR 2022) + - [A ConvNet for the 2020s](https://arxiv.org/abs/2201.03545) (CVPR 2022) + This function returns a Keras {name} model. Args: - include_rescaling: bool, whether or not to Rescale the inputs. If set + include_rescaling: bool, whether to rescale the inputs. If set to `True`, inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected layer at - the top of the network. If provided, `num_classes` must be provided. + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. depths: an iterable containing depths for each individual stages. projection_dims: An iterable containing output number of channels of each individual stages. @@ -72,25 +72,27 @@ layer_scale_init_value: layer scale coefficient, if 0.0, layer scaling won't be used. weights: one of `None` (random initialization), a pretrained weight file - path, or a reference to pre-trained weights (e.g. 'imagenet/classification') - (see available pre-trained weights in weights.py) + path, or a reference to pre-trained weights (e.g. + 'imagenet/classification')(see available pre-trained weights in + weights.py) input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model. pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. - - `avg` means that global average pooling will be applied to the output - of the last convolutional block, and thus the output of the model will - be a 2D tensor. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. + - `avg` means that global average pooling will be applied to the + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. - num_classes: optional int, number of classes to classify images into (only - to be specified if `include_top` is `True`). - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. - name: (Optional) name to pass to the model. Defaults to "{name}". + num_classes: optional int, number of classes to classify images into + (only to be specified if `include_top` is `True`). 
+ classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. + name: (Optional) name to pass to the model, defaults to "{name}". Returns: A `keras.Model` instance. @@ -159,7 +161,7 @@ def apply_block( name: name to path to the keras layer. Returns: A function representing a ConvNeXtBlock block. - """ + """ # noqa: E501 if name is None: name = "prestem" + str(backend.get_uid("prestem")) @@ -217,45 +219,47 @@ def apply_head(x, num_classes, activation="softmax", name=None): class ConvNeXt(keras.Model): """Instantiates ConvNeXt architecture given specific configuration. Args: - include_rescaling: bool, whether or not to Rescale the inputs. If set + include_rescaling: bool, whether to rescale the inputs. If set to `True`, inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected layer at - the top of the network. If provided, `num_classes` must be provided. - depths: An iterable containing depths for each individual stages. - projection_dims: An iterable containing output number of channels of - each individual stages. - drop_path_rate: Stochastic depth probability. If 0.0, then stochastic + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. + depths: An iterable containing depths for each individual stages. + projection_dims: An iterable containing output number of channels of + each individual stages. + drop_path_rate: Stochastic depth probability. If 0.0, then stochastic depth won't be used. - layer_scale_init_value: Layer scale coefficient. If 0.0, layer scaling + layer_scale_init_value: Layer scale coefficient. If 0.0, layer scaling won't be used. - weights: one of `None` (random initialization), a pretrained weight file - path, or a reference to pre-trained weights (e.g. 'imagenet/classification') - (see available pre-trained weights in weights.py) - input_shape: optional shape tuple, defaults to (None, None, 3). + weights: one of `None` (random initialization), a pretrained weight file + path, or a reference to pre-trained weights (e.g. + 'imagenet/classification')(see available pre-trained weights in + weights.py) + input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model. - pooling: optional pooling mode for feature extraction + pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. - - `avg` means that global average pooling will be applied to the output - of the last convolutional block, and thus the output of the model will - be a 2D tensor. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. + - `avg` means that global average pooling will be applied to the + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. - num_classes: optional int, number of classes to classify images into (only - to be specified if `include_top` is `True`). - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. 
Set - `classifier_activation=None` to return the logits of the "top" layer. - name: (Optional) name to pass to the model. Defaults to "convnext". + num_classes: optional int, number of classes to classify images into + (only to be specified if `include_top` is `True`). + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. + name: (Optional) name to pass to the model, defaults to "convnext". Returns: A `keras.Model` instance. Raises: - ValueError: in case of invalid argument for `weights`, - or invalid input shape. - ValueError: if `classifier_activation` is not `softmax`, or `None` - when using a pretrained top layer. + ValueError: in case of invalid argument for `weights`, or invalid input + shape. + ValueError: if `classifier_activation` is not `softmax`, or `None` when + using a pretrained top layer. ValueError: if `include_top` is True but `num_classes` is not specified. """ diff --git a/keras_cv/models/csp_darknet.py b/keras_cv/models/csp_darknet.py index aa61e1598d..9bfc5d1a75 100644 --- a/keras_cv/models/csp_darknet.py +++ b/keras_cv/models/csp_darknet.py @@ -52,8 +52,8 @@ } BASE_DOCSTRING = """Represents the {name} architecture. The CSPDarkNet architectures are commonly used for detection tasks. It is - possible to extract the intermediate dark2 to dark5 layers from the model for - creating a feature pyramid Network. + possible to extract the intermediate dark2 to dark5 layers from the model + for creating a feature pyramid Network. Reference: - [YoloV4 Paper](https://arxiv.org/abs/1804.02767) - [CSPNet Paper](https://arxiv.org/pdf/1911.11929) @@ -62,43 +62,46 @@ For transfer learning use cases, make sure to read the [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - include_rescaling: bool, whether or not to rescale the inputs. If set to True, - inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected layer at the top of - the network. If provided, `num_classes` must be provided. - use_depthwise: a boolean value used to decide whether a depthwise conv block - should be used over a regular darknet block. Defaults to False - num_classes: integer, optional number of classes to classify images into. Only to be - specified if `include_top` is True. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(1/255.0)` layer. + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. + use_depthwise: a boolean value used to decide whether a depthwise conv + block should be used over a regular darknet block, defaults to + False. + num_classes: integer, optional number of classes to classify images + into. Only to be specified if `include_top` is True. weights: one of `None` (random initialization), a pretrained weight file - path, or a reference to pre-trained weights (e.g. 'imagenet/classification') - (see available pre-trained weights in weights.py) + path, or a reference to pre-trained weights (e.g. + 'imagenet/classification')(see available pre-trained weights in + weights.py) input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model. input_shape: optional shape tuple, defaults to (None, None, 3). 
pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. - `avg` means that global average pooling will be applied to the - output of the last convolutional block, and thus the output of the - model will be a 2D tensor. + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. name: string, optional name to pass to the model, defaults to "{name}". Returns: A `keras.Model` instance. -""" +""" # noqa: E501 @keras.utils.register_keras_serializable(package="keras_cv.models") class CSPDarkNet(keras.Model): """This class represents the CSPDarkNet architecture. - Although the DarkNet architecture is commonly used for detection tasks, it is - possible to extract the intermediate dark2 to dark5 layers from the model for - creating a feature pyramid Network. + Although the DarkNet architecture is commonly used for detection tasks, it + is possible to extract the intermediate dark2 to dark5 layers from the model + for creating a feature pyramid Network. Reference: - [YoloV4 Paper](https://arxiv.org/abs/1804.02767) - [CSPNet Paper](https://arxiv.org/pdf/1911.11929) @@ -107,39 +110,42 @@ class CSPDarkNet(keras.Model): For transfer learning use cases, make sure to read the [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - depth_multiplier: A float value used to calculate the base depth of the model - this changes based the detection model being used. - width_multiplier: A float value used to calculate the base width of the model - this changes based the detection model being used. - include_rescaling: bool ,whether or not to Rescale the inputs.If set to True, - inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected layer at the top of - the network. If provided, `num_classes` must be provided. - use_depthwise: a boolean value used to decide whether a depthwise conv block - should be used over a regular darknet block. Defaults to False - num_classes: optional int,optional number of classes to classify images into, only to be - specified if `include_top` is True. + depth_multiplier: A float value used to calculate the base depth of the + model this changes based the detection model being used. + width_multiplier: A float value used to calculate the base width of the + model this changes based the detection model being used. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(1/255.0)` layer. + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. + use_depthwise: a boolean value used to decide whether a depthwise conv + block should be used over a regular darknet block, defaults to + False. 
+ num_classes: optional int,optional number of classes to classify images + into, only to be specified if `include_top` is True. weights: one of `None` (random initialization), a pretrained weight file - path, or a reference to pre-trained weights (e.g. 'imagenet/classification') - (see available pre-trained weights in weights.py) + path, or a reference to pre-trained weights (e.g. + 'imagenet/classification')(see available pre-trained weights in + weights.py) input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model. input_shape: optional shape tuple, defaults to (None, None, 3). pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. - `avg` means that global average pooling will be applied to the - output of the last convolutional block, and thus the output of the - model will be a 2D tensor. + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. - name: (Optional) name to pass to the model. Defaults to "CSPDarkNet". + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. + name: (Optional) name to pass to the model, defaults to "CSPDarkNet". Returns: A `keras.Model` instance. - """ + """ # noqa: E501 def __init__( self, @@ -159,14 +165,15 @@ def __init__( ): if weights and not tf.io.gfile.exists(weights): raise ValueError( - "The `weights` argument should be either `None` or the path to the " - f"weights file to be loaded. Weights file not found at location{weights}" + "The `weights` argument should be either `None` or the path to " + "the weights file to be loaded. Weights file not found at " + f"location{weights}" ) if include_top and not num_classes: raise ValueError( - "If `include_top` is True, you should specify `num_classes`. Received: " - f"num_classes={num_classes}" + "If `include_top` is True, you should specify `num_classes`. " + f"Received: num_classes={num_classes}" ) ConvBlock = ( diff --git a/keras_cv/models/darknet.py b/keras_cv/models/darknet.py index 84ee2819c6..63bf1d7d9c 100644 --- a/keras_cv/models/darknet.py +++ b/keras_cv/models/darknet.py @@ -33,24 +33,23 @@ BASE_DOCSTRING = """Represents the {name} architecture. Although the {name} architecture is commonly used for detection tasks, it is - possible to extract the intermediate dark2 to dark5 layers from the model for - creating a feature pyramid Network. + possible to extract the intermediate dark2 to dark5 layers from the model + for creating a feature pyramid Network. Reference: - [YoloV3 Paper](https://arxiv.org/abs/1804.02767) - [YoloV3 implementation](https://github.com/ultralytics/yolov3) For transfer learning use cases, make sure to read the - [guide to transfer learning & fine-tuning]( - https://keras.io/guides/transfer_learning/). + [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). 
Args: - include_rescaling: bool, whether or not to rescale the inputs. If set to True, - inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected layer at the top of - the network. If provided, `num_classes` must be provided. - num_classes: integer, optional number of classes to classify images into. Only to be - specified if `include_top` is True. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(1/255.0)` layer. + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. + num_classes: integer, optional number of classes to classify images + into. Only to be specified if `include_top` is True. weights: one of `None` (random initialization), or a pretrained weight file path. input_shape: optional shape tuple, defaults to (None, None, 3). @@ -58,17 +57,17 @@ to use as image input for the model. pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. - `avg` means that global average pooling will be applied to the - output of the last convolutional block, and thus the output of the - model will be a 2D tensor. + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. name: string, optional name to pass to the model, defaults to "{name}". Returns: A `keras.Model` instance. -""" +""" # noqa: E501 @keras.utils.register_keras_serializable(package="keras_cv.models") @@ -77,24 +76,24 @@ class DarkNet(keras.Model): """Represents the DarkNet architecture. The DarkNet architecture is commonly used for detection tasks. It is - possible to extract the intermediate dark2 to dark5 layers from the model for - creating a feature pyramid Network. + possible to extract the intermediate dark2 to dark5 layers from the model + for creating a feature pyramid Network. Reference: - [YoloV3 Paper](https://arxiv.org/abs/1804.02767) - [YoloV3 implementation](https://github.com/ultralytics/yolov3) For transfer learning use cases, make sure to read the - [guide to transfer learning & fine-tuning]( - https://keras.io/guides/transfer_learning/). + [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - blocks: integer, numbers of building blocks from the layer dark2 to layer dark5. + blocks: integer, numbers of building blocks from the layer dark2 to + layer dark5. include_rescaling: bool, whether to rescale the inputs. If set to True, inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected layer at the top of - the network. If provided, `num_classes` must be provided. - num_classes: integer, optional number of classes to classify images into. Only to be - specified if `include_top` is True. + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. + num_classes: integer, optional number of classes to classify images + into. Only to be specified if `include_top` is True. weights: one of `None` (random initialization) or a pretrained weight file path. 
input_shape: optional shape tuple, defaults to (None, None, 3). @@ -102,20 +101,21 @@ class DarkNet(keras.Model): to use as image input for the model. pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. - `avg` means that global average pooling will be applied to the - output of the last convolutional block, and thus the output of the - model will be a 2D tensor. + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. name: string, optional name to pass to the model, defaults to "DarkNet". Returns: A `keras.Model` instance. - """ + """ # noqa: E501 def __init__( self, @@ -133,14 +133,15 @@ def __init__( ): if weights and not tf.io.gfile.exists(weights): raise ValueError( - "The `weights` argument should be either `None` or the path to the " - f"weights file to be loaded. Weights file not found at location: {weights}" + "The `weights` argument should be either `None` or the path to " + "the weights file to be loaded. Weights file not found at " + f"location: {weights}" ) if include_top and not num_classes: raise ValueError( - "If `include_top` is True, you should specify `num_classes`. Received: " - f"num_classes={num_classes}" + "If `include_top` is True, you should specify `num_classes`. " + f"Received: num_classes={num_classes}" ) inputs = utils.parse_model_inputs(input_shape, input_tensor) @@ -164,7 +165,8 @@ def __init__( # filters for the ResidualBlock outputs filters = [128, 256, 512, 1024] - # layer_num is used for naming the residual blocks (starts with dark2, hence 2) + # layer_num is used for naming the residual blocks + # (starts with dark2, hence 2) layer_num = 2 for filter, block in zip(filters, blocks): diff --git a/keras_cv/models/densenet.py b/keras_cv/models/densenet.py index 6d4c481925..5809e182b0 100644 --- a/keras_cv/models/densenet.py +++ b/keras_cv/models/densenet.py @@ -17,7 +17,7 @@ Reference: - [Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993) - [Based on the Original keras.applications DenseNet](https://github.com/keras-team/keras/blob/master/keras/applications/densenet.py) -""" +""" # noqa: E501 from tensorflow import keras from tensorflow.keras import backend @@ -48,39 +48,41 @@ This function returns a Keras {name} model. - For transfer learning use cases, make sure to read the [guide to transfer - learning & fine-tuning](https://keras.io/guides/transfer_learning/). + For transfer learning use cases, make sure to read the + [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - include_rescaling: bool, whether or not to Rescale the inputs. If set + include_rescaling: bool, whether to rescale the inputs. If set to `True`, inputs will be passed through a `Rescaling(1/255.0)` layer. 
- include_top: bool, whether to include the fully-connected layer at - the top of the network. If provided, `num_classes` must be provided. - num_classes: optional int, number of classes to classify images into (only - to be specified if `include_top` is `True`). + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. + num_classes: optional int, number of classes to classify images into + (only to be specified if `include_top` is `True`). weights: one of `None` (random initialization), a pretrained weight file - path, or a reference to pre-trained weights (e.g. 'imagenet/classification') - (see available pre-trained weights in weights.py) + path, or a reference to pre-trained weights (e.g. + 'imagenet/classification')(see available pre-trained weights in + weights.py) input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model. pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. - - `avg` means that global average pooling will be applied to the output - of the last convolutional block, and thus the output of the model will - be a 2D tensor. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. + - `avg` means that global average pooling will be applied to the + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. - name: (Optional) name to pass to the model. Defaults to "{name}". - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. + name: (Optional) name to pass to the model, defaults to "{name}". + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. Returns: A `keras.Model` instance. -""" +""" # noqa: E501 def apply_dense_block(x, blocks, name=None): @@ -91,7 +93,7 @@ def apply_dense_block(x, blocks, name=None): name: string, block label. Returns: - a function that takes an input Tensor representing a apply_dense_block. + a function that takes an input Tensor representing an apply_dense_block. """ if name is None: name = f"dense_block_{backend.get_uid('dense_block')}" @@ -109,7 +111,8 @@ def apply_transition_block(x, reduction, name=None): name: string, block label. Returns: - a function that takes an input Tensor representing a apply_transition_block. + a function that takes an input Tensor representing an + apply_transition_block. """ if name is None: name = f"transition_block_{backend.get_uid('transition_block')}" @@ -173,40 +176,42 @@ class DenseNet(keras.Model): This function returns a Keras DenseNet model. - For transfer learning use cases, make sure to read the [guide to transfer - learning & fine-tuning](https://keras.io/guides/transfer_learning/). + For transfer learning use cases, make sure to read the + [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: blocks: numbers of building blocks for the four dense layers. 
- include_rescaling: bool, whether or not to Rescale the inputs. If set + include_rescaling: bool, whether to rescale the inputs. If set to `True`, inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected layer at - the top of the network. If provided, `num_classes` must be provided. - num_classes: optional int, number of classes to classify images into (only - to be specified if `include_top` is `True`). + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. + num_classes: optional int, number of classes to classify images into + (only to be specified if `include_top` is `True`). weights: one of `None` (random initialization), a pretrained weight file - path, or a reference to pre-trained weights (e.g. 'imagenet/classification') - (see available pre-trained weights in weights.py) + path, or a reference to pre-trained weights (e.g. + 'imagenet/classification')(see available pre-trained weights in + weights.py) input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model. pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. - - `avg` means that global average pooling will be applied to the output - of the last convolutional block, and thus the output of the model will - be a 2D tensor. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. + - `avg` means that global average pooling will be applied to the + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. - name: (Optional) name to pass to the model. Defaults to "DenseNet". - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. + name: (Optional) name to pass to the model, defaults to "DenseNet". + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. Returns: A `keras.Model` instance. - """ + """ # noqa: E501 def __init__( self, diff --git a/keras_cv/models/efficientnet_lite.py b/keras_cv/models/efficientnet_lite.py index f5945fab5b..560fd392e0 100644 --- a/keras_cv/models/efficientnet_lite.py +++ b/keras_cv/models/efficientnet_lite.py @@ -16,10 +16,10 @@ """EfficientNet Lite models for Keras. Reference: - - [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks]( - https://arxiv.org/abs/1905.11946) (ICML 2019) + - [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](https://arxiv.org/abs/1905.11946) + (ICML 2019) - [Based on the original EfficientNet Lite's](https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet/lite) -""" +""" # noqa: E501 import copy import math @@ -118,48 +118,48 @@ BASE_DOCSTRING = """Instantiates the {name} architecture. 
Reference: - - [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks]( - https://arxiv.org/abs/1905.11946) (ICML 2019) + - [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](https://arxiv.org/abs/1905.11946) + (ICML 2019) This function returns a Keras {name} model. - For image classification use cases, see - [this page for detailed examples]( - https://keras.io/api/applications/#usage-examples-for-image-classification-models). + For image classification use cases, see [this page for detailed examples](https://keras.io/api/applications/#usage-examples-for-image-classification-models). - For transfer learning use cases, make sure to read the [guide to transfer - learning & fine-tuning](https://keras.io/guides/transfer_learning/). + For transfer learning use cases, make sure to read the + [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - include_rescaling: bool, whether or not to Rescale the inputs. If set + include_rescaling: bool, whether to rescale the inputs. If set to `True`, inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected layer at - the top of the network. If provided, `num_classes` must be provided. - num_classes: optional int, number of classes to classify images into (only - to be specified if `include_top` is `True`). + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. + num_classes: optional int, number of classes to classify images into + (only to be specified if `include_top` is `True`). weights: one of `None` (random initialization), a pretrained weight file - path, or a reference to pre-trained weights (e.g. 'imagenet/classification') - (see available pre-trained weights in weights.py) + path, or a reference to pre-trained weights (e.g. + 'imagenet/classification')(see available pre-trained weights in + weights.py) input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model. pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. - - `avg` means that global average pooling will be applied to the output - of the last convolutional block, and thus the output of the model will - be a 2D tensor. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. + - `avg` means that global average pooling will be applied to the + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. - name: (Optional) name to pass to the model. Defaults to "{name}". + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. + name: (Optional) name to pass to the model, defaults to "{name}". Returns: A `keras.Model` instance. 
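A minimal feature-extraction sketch following the `include_top` / `pooling` semantics described above; the `EfficientNetLiteB0` alias name is an assumption, not taken from this file:

```python
import tensorflow as tf

from keras_cv import models

# Headless feature extractor: `include_top=False` with global average pooling,
# per the `pooling` argument documented above.
backbone = models.EfficientNetLiteB0(  # alias name is assumed
    include_rescaling=True,
    include_top=False,
    pooling="avg",
)
features = backbone(tf.ones(shape=(1, 224, 224, 3)))
```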
-""" +""" # noqa: E501 BN_AXIS = 3 @@ -297,10 +297,11 @@ def apply_efficient_net_lite_block( @keras.utils.register_keras_serializable(package="keras_cv.models") class EfficientNetLite(keras.Model): - """Instantiates the EfficientNetLite architecture using given scaling coefficients. + """Instantiates the EfficientNetLite architecture using given scaling + coefficients. Args: - include_rescaling: whether to Rescale the inputs. If set to True, + include_rescaling: whether to rescale the inputs. If set to True, inputs will be passed through a `Rescaling(1/255.0)` layer. include_top: whether to include the fully-connected layer at the top of the network. @@ -333,9 +334,10 @@ class EfficientNetLite(keras.Model): num_classes: optional number of classes to classify images into, only to be specified if `include_top` is True, and if no `weights` argument is specified. - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. Returns: A `keras.Model` instance. @@ -344,8 +346,8 @@ class EfficientNetLite(keras.Model): ValueError: if `blocks_args` is invalid. ValueError: in case of invalid argument for `weights`, or invalid input shape. - ValueError: if `classifier_activation` is not `softmax` or `None` when - using a pretrained top layer. + ValueError: if `classifier_activation` is not `softmax` or `None` + when using a pretrained top layer. """ def __init__( diff --git a/keras_cv/models/efficientnet_v1.py b/keras_cv/models/efficientnet_v1.py index c66d1d3ecc..0be455206b 100644 --- a/keras_cv/models/efficientnet_v1.py +++ b/keras_cv/models/efficientnet_v1.py @@ -16,10 +16,10 @@ """EfficientNet models for Keras. Reference: - - [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks]( - https://arxiv.org/abs/1905.11946) (ICML 2019) + - [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](https://arxiv.org/abs/1905.11946) + (ICML 2019) - [Based on the original keras.applications EfficientNet](https://github.com/keras-team/keras/blob/master/keras/applications/efficientnet.py) -""" +""" # noqa: E501 import copy import math @@ -126,54 +126,48 @@ BASE_DOCSTRING = """Instantiates the {name} architecture. Reference: - - [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks]( - https://arxiv.org/abs/1905.11946) (ICML 2019) + - [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](https://arxiv.org/abs/1905.11946) + (ICML 2019) This class represents a Keras image classification model. For image classification use cases, see - [this page for detailed examples]( - https://keras.io/api/applications/#usage-examples-for-image-classification-models). + [this page for detailed examples](https://keras.io/api/applications/#usage-examples-for-image-classification-models). For transfer learning use cases, make sure to read the - [guide to transfer learning & fine-tuning]( - https://keras.io/guides/transfer_learning/). + [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - include_rescaling: bool, whether or not to Rescale the inputs.If set to True, - inputs will be passed through a `Rescaling(1/255.0)` layer. 
- include_top: bool, Whether to include the fully-connected - layer at the top of the network. - weights: One of `None` (random initialization), - or the path to the weights file to be loaded. - input_shape: tuple, Optional shape tuple. - It should have exactly 3 inputs channels. - input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) - to use as image input for the model. - pooling: Optional pooling mode for feature extraction - when `include_top` is `False`. Defaults to None. - - `None` means that the output of the model will be - the 4D tensor output of the - last convolutional layer. - - `avg` means that global average pooling - will be applied to the output of the - last convolutional layer, and thus - the output of the model will be a 2D tensor. - - `max` means that global max pooling will - be applied. - num_classes: int, Optional number of classes to classify images - into, only to be specified if `include_top` is True, and - if no `weights` argument is specified. Defaults to None. - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. - Defaults to 'softmax'. - When loading pretrained weights, `classifier_activation` can only - be `None` or `"softmax"`. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(1/255.0)` layer. + include_top: bool, Whether to include the fully-connected layer at the + top of the network. + weights: One of `None` (random initialization), or the path to the + weights file to be loaded. + input_shape: tuple, Optional shape tuple. It should have exactly 3 + inputs channels. + input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to + use as image input for the model. + pooling: Optional pooling mode for feature extraction when `include_top` + is `False`, defaults to None. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional layer. + - `avg` means that global average pooling will be applied to the + output of the last convolutional layer, and thus the output of + the model will be a 2D tensor. + - `max` means that global max pooling will be applied. + num_classes: int, Optional number of classes to classify images into, + only to be specified if `include_top` is True, and if no `weights` + argument is specified, defaults to None. + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. Defaults to 'softmax'. When loading pretrained weights, + `classifier_activation` can only be `None` or `"softmax"`. Returns: A `keras.Model` instance. -""" +""" # noqa: E501 BN_AXIS = 3 @@ -215,23 +209,26 @@ def apply_conv_bn( name="", ): """ - Represents Convolutional Block with optional Batch Normalization layer and activation layer + Represents Convolutional Block with optional Batch Normalization layer and + activation layer Args: x: Tensor conv_type: str, Type of Conv layer to be used in block. - 'normal': The Conv2D layer will be used. - 'depth': The DepthWiseConv2D layer will be used. - filters: int, The filter size of the Conv layer. - It should be `None` when `conv_type` is set as `depth` + filters: int, The filter size of the Conv layer. 
It should be `None` + when `conv_type` is set as `depth` kernel_size: int (or) tuple, The kernel size of the Conv layer. strides: int (or) tuple, The stride value of Conv layer. padding: str (or) callable, The type of padding for Conv layer. use_bias: bool, Boolean to use bias for Conv layer. - kernel_initializer: dict (or) str (or) callable, The kernel initializer for Conv layer. + kernel_initializer: dict (or) str (or) callable, The kernel initializer + for Conv layer. bn_norm: bool, Boolean to add BatchNormalization layer after Conv layer. - activation: str (or) callable, Activation to be applied on the output at end. - name: name of the block + activation: str (or) callable, Activation to be applied on the output at + the end. + name: str, name of the block Returns: tf.Tensor @@ -239,7 +236,8 @@ def apply_conv_bn( if conv_type == "normal": if filters is None or kernel_size is None: raise ValueError( - "The filter size and kernel size should be set for Conv2D layer." + "The filter size and kernel size should be set for Conv2D " + "layer." ) x = layers.Conv2D( filters, @@ -257,7 +255,8 @@ def apply_conv_bn( ) if kernel_size is None or strides is None: raise ValueError( - "The kernel size and strides should be set for DepthWiseConv2D layer." + "The kernel size and strides should be set for DepthWiseConv2D " + "layer." ) x = layers.DepthwiseConv2D( kernel_size, @@ -269,7 +268,8 @@ def apply_conv_bn( )(x) else: raise ValueError( - "The 'conv_type' parameter should be set either to 'normal' or 'depth'" + "The 'conv_type' parameter should be set either to 'normal' or " + "'depth'" ) if bn_norm: @@ -409,10 +409,10 @@ def apply_efficientnet_block( class EfficientNet(keras.Model): """This class represents a Keras EfficientNet architecture. Args: - include_rescaling: bool, whether or not to Rescale the inputs.If set to True, - inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected - layer at the top of the network. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(1/255.0)` layer. + include_top: bool, whether to include the fully-connected layer at the + top of the network. width_coefficient: float, scaling coefficient for network width. depth_coefficient: float, scaling coefficient for network depth. default_size: integer, default input image size. @@ -422,34 +422,32 @@ class EfficientNet(keras.Model): activation: activation function. blocks_args: list of dicts, parameters to construct block modules. model_name: string, model name. - weights: one of `None` (random initialization), - or the path to the weights file to be loaded. - input_shape: optional shape tuple, - It should have exactly 3 inputs channels. - input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) - to use as image input for the model. - pooling: optional pooling mode for feature extraction - when `include_top` is `False`. - - `None` means that the output of the model will be - the 4D tensor output of the - last convolutional layer. - - `avg` means that global average pooling - will be applied to the output of the - last convolutional layer, and thus - the output of the model will be a 2D tensor. - - `max` means that global max pooling will - be applied. - num_classes: optional number of classes to classify images - into, only to be specified if `include_top` is True, and - if no `weights` argument is specified. - classifier_activation: A `str` or callable. 
The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. + weights: one of `None` (random initialization), or the path to the + weights file to be loaded. + input_shape: optional shape tuple, it should have exactly 3 input + channels. + input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to + use as image input for the model. + pooling: optional pooling mode for feature extraction when `include_top` + is `False`. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional layer. + - `avg` means that global average pooling will be applied to the + output of the last convolutional layer, and thus the output of + the model will be a 2D tensor. + - `max` means that global max pooling will be applied. + num_classes: optional number of classes to classify images into, + only to be specified if `include_top` is True, and if no `weights` + argument is specified. + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. Returns: A `keras.Model` instance. Raises: - ValueError: in case of invalid argument for `weights`, - or invalid input shape. + ValueError: in case of invalid argument for `weights`, or invalid input + shape. ValueError: if `classifier_activation` is not `softmax` or `None` when using a pretrained top layer. """ @@ -482,8 +480,9 @@ def __init__( if weights and not tf.io.gfile.exists(weights): raise ValueError( - "The `weights` argument should be either `None` or the path to the " - "weights file to be loaded. Weights file not found at location: {weights}" + "The `weights` argument should be either `None` or the path to " + "the weights file to be loaded. Weights file not found at " + f"location: {weights}" ) if include_top and not num_classes: @@ -625,7 +624,8 @@ def round_filters(filters, width_coefficient, divisor): """Round number of filters based on depth multiplier. Args: filters: int, number of filters for Conv layer - width_coefficient: float, denotes the scaling coefficient of network width + width_coefficient: float, denotes the scaling coefficient of network + width divisor: int, a unit of network width Returns: @@ -644,8 +644,9 @@ def round_filters(filters, width_coefficient, divisor): def round_repeats(repeats, depth_coefficient): """Round number of repeats based on depth multiplier. Args: - repeats: int, number of repeats of efficentnet block - depth_coefficient: float, denotes the scaling coefficient of network depth + repeats: int, number of repeats of efficientnet block + depth_coefficient: float, denotes the scaling coefficient of network + depth Returns: int, rounded repeats diff --git a/keras_cv/models/efficientnet_v2.py b/keras_cv/models/efficientnet_v2.py index 114f443edf..f5a4789742 100644 --- a/keras_cv/models/efficientnet_v2.py +++ b/keras_cv/models/efficientnet_v2.py @@ -16,10 +16,9 @@ """EfficientNet V2 models for KerasCV. 
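As an aside on the `round_filters` / `round_repeats` hunks from `keras_cv/models/efficientnet.py` above: a minimal, self-contained sketch of the conventional EfficientNet scaling rules looks like the following. The 90%-floor threshold and exact arithmetic are assumptions based on the standard EfficientNet recipe, not a copy of the KerasCV helpers.

```python
import math


def round_filters(filters, width_coefficient, divisor):
    """Sketch of the width-scaling rule: scale, then round to a multiple of divisor."""
    filters *= width_coefficient
    new_filters = max(divisor, int(filters + divisor / 2) // divisor * divisor)
    # Never round down by more than 10% of the scaled filter count (assumed floor).
    if new_filters < 0.9 * filters:
        new_filters += divisor
    return int(new_filters)


def round_repeats(repeats, depth_coefficient):
    """Sketch of the depth-scaling rule: scale the block count and round up."""
    return int(math.ceil(depth_coefficient * repeats))


print(round_filters(32, width_coefficient=1.1, divisor=8))  # -> 32
print(round_repeats(2, depth_coefficient=1.2))  # -> 3
```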
Reference: - - [EfficientNetV2: Smaller Models and Faster Training]( - https://arxiv.org/abs/2104.00298) (ICML 2021) + - [EfficientNetV2: Smaller Models and Faster Training](https://arxiv.org/abs/2104.00298) (ICML 2021) - [Based on the original keras.applications EfficientNetV2](https://github.com/keras-team/keras/blob/master/keras/applications/efficientnet_v2.py) -""" +""" # noqa: E501 import copy import math @@ -513,47 +512,47 @@ BASE_DOCSTRING = """Instantiates the {name} architecture. Reference: - - [EfficientNetV2: Smaller Models and Faster Training]( - https://arxiv.org/abs/2104.00298) (ICML 2021) + - [EfficientNetV2: Smaller Models and Faster Training](https://arxiv.org/abs/2104.00298) + (ICML 2021) This function returns a Keras image classification model. For image classification use cases, see - [this page for detailed examples]( - https://keras.io/api/applications/#usage-examples-for-image-classification-models). + [this page for detailed examples](https://keras.io/api/applications/#usage-examples-for-image-classification-models). For transfer learning use cases, make sure to read the - [guide to transfer learning & fine-tuning]( - https://keras.io/guides/transfer_learning/). + [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - include_rescaling: bool, whether or not to Rescale the inputs. If set + include_rescaling: bool, whether to rescale the inputs. If set to `True`, inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected layer at - the top of the network. If provided, `num_classes` must be provided. - num_classes: optional int, number of classes to classify images into (only - to be specified if `include_top` is `True`). + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. + num_classes: optional int, number of classes to classify images into + (only to be specified if `include_top` is `True`). weights: one of `None` (random initialization), a pretrained weight file - path, or a reference to pre-trained weights (e.g. 'imagenet/classification') - (see available pre-trained weights in weights.py) + path, or a reference to pre-trained weights (e.g. + 'imagenet/classification') (see available pre-trained weights in + weights.py) input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model. - pooling: optional pooling mode for feature extraction - when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. - - `avg` means that global average pooling will be applied to the output - of the last convolutional block, and thus the output of the model will - be a 2D tensor. + pooling: optional pooling mode for feature extraction when `include_top` + is `False`. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. + - `avg` means that global average pooling will be applied to the + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. 
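As an aside on the `include_top` / `pooling` semantics documented in the hunk above, a typical usage sketch is shown below. `EfficientNetV2B0` is assumed to be one of the aliases exported from `keras_cv.models`; substitute whichever variant your version provides.

```python
import keras_cv

# Headless feature extractor: include_top=False plus pooling="avg" yields a
# 2D (batch, channels) output instead of the 4D convolutional feature map.
features = keras_cv.models.EfficientNetV2B0(
    include_rescaling=True,  # prepends a Rescaling(1/255.0) layer
    include_top=False,
    pooling="avg",
    input_shape=(224, 224, 3),
)

# Classifier head: include_top=True requires num_classes to be specified.
classifier = keras_cv.models.EfficientNetV2B0(
    include_rescaling=True,
    include_top=True,
    num_classes=10,
    input_shape=(224, 224, 3),
)
```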
+ classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. Returns: A `keras.Model` instance. -""" +""" # noqa: E501 def round_filters(filters, width_coefficient, min_depth, depth_divisor): @@ -577,7 +576,7 @@ class EfficientNetV2(keras.Model): """Instantiates the EfficientNetV2 architecture using given scaling coefficients. Args: - include_rescaling: bool, whether or not to Rescale the inputs. If set + include_rescaling: bool, whether to rescale the inputs. If set to `True`, inputs will be passed through a `Rescaling(1/255.0)` layer. include_top: bool, whether to include the fully-connected @@ -655,8 +654,9 @@ def __init__( if weights and not tf.io.gfile.exists(weights): raise ValueError( - "The `weights` argument should be either `None` or the path to the " - "weights file to be loaded. Weights file not found at location: {weights}" + "The `weights` argument should be either `None` or the path to " + "the weights file to be loaded. Weights file not found at " + f"location: {weights}" ) if include_top and not num_classes: diff --git a/keras_cv/models/mlp_mixer.py b/keras_cv/models/mlp_mixer.py index 5686de0672..b97a007079 100644 --- a/keras_cv/models/mlp_mixer.py +++ b/keras_cv/models/mlp_mixer.py @@ -16,7 +16,7 @@ Reference: - [MLP-Mixer: An all-MLP Architecture for Vision](https://arxiv.org/abs/2105.01601) -""" +""" # noqa: E501 import tensorflow as tf from tensorflow import keras @@ -60,34 +60,36 @@ learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - include_rescaling: bool, whether or not to rescale the inputs. If set to True, - inputs will be passed through a `Rescaling(1/255.0)` layer. - include_top: bool, whether to include the fully-connected layer at the top of the - network. If provided, num_classes must be provided. - num_classes: integer, optional number of classes to classify images into. Only to be - specified if `include_top` is True. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(1/255.0)` layer. + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, num_classes must be provided. + num_classes: integer, optional number of classes to classify images + into. Only to be specified if `include_top` is True. weights: one of `None` (random initialization), a pretrained weight file - path, or a reference to pre-trained weights (e.g. 'imagenet/classification') - (see available pre-trained weights in weights.py) + path, or a reference to pre-trained weights (e.g. + 'imagenet/classification')(see available pre-trained weights in + weights.py) input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e., output of `layers.Input()`) to use as image input for the model. pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. - - `avg` means that global average pooling will be applied to the output - of the last convolutional block, and thus the output of the model will - be a 2D tensor. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. 
+ - `avg` means that global average pooling will be applied to the + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. name: string, optional name to pass to the model, defaults to "{name}". - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. Returns: A `keras.Model` instance. -""" +""" # noqa: E501 def apply_mlp_block(x, mlp_dim, name=None): @@ -149,13 +151,17 @@ class MLPMixer(keras.Model): patch_size: integer denoting the size of the patches to be extracted from the inputs (16 for extracting 16x16 patches for example). num_blocks: integer, number of mixer blocks. - hidden_dim: integer, dimension to which the patches will be linearly projected. - tokens_mlp_dim: integer, dimension of the MLP block responsible for tokens. - channels_mlp_dim: integer, dimension of the MLP block responsible for channels. + hidden_dim: integer, dimension to which the patches will be linearly + projected. + tokens_mlp_dim: integer, dimension of the MLP block responsible for + tokens. + channels_mlp_dim: integer, dimension of the MLP block responsible for + channels. include_rescaling: whether to rescale the inputs. If set to True, inputs will be passed through a `Rescaling(1/255.0)` layer. include_top: bool, whether to include the fully-connected - layer at the top of the network. If provided, num_classes must be provided. + layer at the top of the network. If provided, num_classes must be + provided. num_classes: integer, optional number of classes to classify images into. Only to be specified if `include_top` is True. weights: one of `None` (random initialization) or a pretrained diff --git a/keras_cv/models/mobilenet_v3.py b/keras_cv/models/mobilenet_v3.py index 14eeaf7057..4ff4958781 100644 --- a/keras_cv/models/mobilenet_v3.py +++ b/keras_cv/models/mobilenet_v3.py @@ -15,9 +15,9 @@ """MobileNet v3 models for KerasCV. References: - - [Searching for MobileNetV3](https://arxiv.org/pdf/1905.02244.pdf) (ICCV 2019) + - [Searching for MobileNetV3](https://arxiv.org/pdf/1905.02244.pdf)(ICCV 2019) - [Based on the original keras.applications MobileNetv3](https://github.com/keras-team/keras/blob/master/keras/applications/mobilenet_v3.py) -""" +""" # noqa: E501 import tensorflow as tf from tensorflow import keras @@ -42,26 +42,26 @@ learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - include_rescaling: bool, whether or not to rescale the inputs. If set to True, - inputs will be passed through a `Rescaling(scale=1 / 255)` + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(scale=1 / 255) layer. Defaults to True. - include_top: bool, whether to include the fully-connected layer at the top of the - network. If provided, `num_classes` must be provided. - num_classes: integer, optional number of classes to classify images into. Only to be - specified if `include_top` is True, and if no `weights` argument is - specified. - weights: one of `None` (random initialization) or a pretrained weight file - path. 
+ include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. + num_classes: integer, optional number of classes to classify images + into. Only to be specified if `include_top` is True, and if no + `weights` argument is specified. + weights: one of `None` (random initialization) or a pretrained weight + file path. input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e., output of `layers.Input()`) to use as image input for the model. pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. - - `avg` means that global average pooling will be applied to the output - of the last convolutional block, and thus the output of the model will - be a 2D tensor. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. + - `avg` means that global average pooling will be applied to the + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. alpha: float, controls the width of the network. This is known as the depth multiplier in the MobileNetV3 paper, but the name is kept for @@ -74,27 +74,29 @@ are used at each layer. minimalistic: in addition to large and small models, this module also contains so-called minimalistic models; these models have the same - per-layer dimensions characteristic as MobilenetV3 however, they don't - utilize any of the advanced blocks (squeeze-and-excite units, hard-swish, - and 5x5 convolutions). While these models are less efficient on CPU, they - are much more performant on GPU/DSP. - dropout_rate: a float between 0 and 1 denoting the fraction of input units to - drop, defaults to 0.2. - classifier_activation: the activation function to use, defaults to softmax. + per-layer dimensions characteristic as MobilenetV3 however, they + don't utilize any of the advanced blocks (squeeze-and-excite units, + hard-swish, and 5x5 convolutions). While these models are less + efficient on CPU, they are much more performant on GPU/DSP. + dropout_rate: a float between 0 and 1 denoting the fraction of input + units to drop, defaults to 0.2. + classifier_activation: the activation function to use, defaults to + softmax. name: string, optional name to pass to the model, defaults to "{name}". Returns: A `keras.Model` instance. -""" +""" # noqa: E501 def depth(x, divisor=8, min_value=None): - """Ensure that all layers have a channel number that is divisible by the `divisor`. + """Ensure that all layers have a channel number that is divisible by the + `divisor`. Args: x: integer, input value. - divisor: integer, the value by which a channel number should be divisible, - defaults to 8. + divisor: integer, the value by which a channel number should be + divisible, defaults to 8. min_value: float, minimum value for the new tensor. Returns: @@ -163,8 +165,8 @@ def apply_inverted_res_block( Args: x: input tensor. - expansion: integer, the expansion ratio, multiplied with infilters to get the - minimum value passed to depth. + expansion: integer, the expansion ratio, multiplied with infilters to + get the minimum value passed to depth. filters: integer, number of filters for convolution layer. kernel_size: integer, the kernel size for DepthWise Convolutions. 
stride: integer, the stride length for DepthWise Convolutions. @@ -252,7 +254,7 @@ class MobileNetV3(keras.Model): """Instantiates the MobileNetV3 architecture. References: - - [Searching for MobileNetV3](https://arxiv.org/pdf/1905.02244.pdf) (ICCV 2019) + - [Searching for MobileNetV3](https://arxiv.org/pdf/1905.02244.pdf)(ICCV 2019) - [Based on the Original keras.applications MobileNetv3](https://github.com/keras-team/keras/blob/master/keras/applications/mobilenet_v3.py) This class represents a Keras MobileNetV3 model. @@ -267,23 +269,23 @@ class MobileNetV3(keras.Model): include_rescaling: bool, whether to rescale the inputs. If set to True, inputs will be passed through a `Rescaling(scale=1 / 255)` layer. - include_top: bool, whether to include the fully-connected layer at the top of the - network. If provided, `num_classes` must be provided. - num_classes: optional number of classes to classify images into. Only to be - specified if `include_top` is True, and if no `weights` argument is - specified. - weights: one of `None` (random initialization) or a pre-trained weight file - path. + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, `num_classes` must be provided. + num_classes: optional number of classes to classify images into. Only to + be specified if `include_top` is True, and if no `weights` argument + is specified. + weights: one of `None` (random initialization) or a pre-trained weight + file path. input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e., output of `layers.Input()`) to use as image input for the model. pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. - - `avg` means that global average pooling will be applied to the output - of the last convolutional block, and thus the output of the model will - be a 2D tensor. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. + - `avg` means that global average pooling will be applied to the + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. alpha: float, controls the width of the network. This is known as the depth multiplier in the MobileNetV3 paper, but the name is kept for @@ -296,24 +298,26 @@ class MobileNetV3(keras.Model): are used at each layer. minimalistic: in addition to large and small models, this module also contains so-called minimalistic models; these models have the same - per-layer dimensions characteristic as MobilenetV3 however, they don't - utilize any of the advanced blocks (squeeze-and-excite units, hard-swish, - and 5x5 convolutions). While these models are less efficient on CPU, they - are much more performant on GPU/DSP. - dropout_rate: a float between 0 and 1 denoting the fraction of input units to - drop, defaults to 0.2. - classifier_activation: the activation function to use, defaults to softmax. - name: string, optional name to pass to the model, defaults to "MobileNetV3". + per-layer dimensions characteristic as MobilenetV3 however, they + don't utilize any of the advanced blocks (squeeze-and-excite units, + hard-swish, and 5x5 convolutions). While these models are less + efficient on CPU, they are much more performant on GPU/DSP. 
+ dropout_rate: a float between 0 and 1 denoting the fraction of input + units to drop, defaults to 0.2. + classifier_activation: the activation function to use, defaults to + softmax. + name: string, optional name to pass to the model, defaults to + "MobileNetV3". **kwargs: Pass-through keyword arguments to `keras.Model`. Returns: A `keras.Model` instance. Raises: - ValueError: if `weights` represents an invalid path to the weights file and is not - None. + ValueError: if `weights` represents an invalid path to the weights file + and is not None. ValueError: if `include_top` is True and `num_classes` is not specified. - """ + """ # noqa: E501 def __init__( self, diff --git a/keras_cv/models/object_detection/__internal__.py b/keras_cv/models/object_detection/__internal__.py index 243638e4f2..5b3ddf912e 100644 --- a/keras_cv/models/object_detection/__internal__.py +++ b/keras_cv/models/object_detection/__internal__.py @@ -47,8 +47,9 @@ def convert_inputs_to_tf_dataset( if isinstance(x, tf.data.Dataset): if y is not None or batch_size is not None: raise ValueError( - "When `x` is a `tf.data.Dataset`, please do not provide a value for " - f"`y` or `batch_size`. Got `y={y}`, `batch_size={batch_size}`." + "When `x` is a `tf.data.Dataset`, please do not provide a " + f"value for `y` or `batch_size`. Got `y={y}`, " + f"`batch_size={batch_size}`." ) return x diff --git a/keras_cv/models/object_detection/__test_utils__.py b/keras_cv/models/object_detection/__test_utils__.py index 489bacd7dc..54f99dc5ce 100644 --- a/keras_cv/models/object_detection/__test_utils__.py +++ b/keras_cv/models/object_detection/__test_utils__.py @@ -19,8 +19,8 @@ def _create_bounding_box_dataset( bounding_box_format, use_dictionary_box_format=False ): - # Just about the easiest dataset you can have, all classes are 0, all boxes are - # exactly the same. [1, 1, 2, 2] are the coordinates in xyxy + # Just about the easiest dataset you can have, all classes are 0, all boxes + # are exactly the same. [1, 1, 2, 2] are the coordinates in xyxy. xs = tf.random.normal(shape=(1, 512, 512, 3), dtype=tf.float32) xs = tf.tile(xs, [5, 1, 1, 1]) diff --git a/keras_cv/models/object_detection/faster_rcnn/faster_rcnn.py b/keras_cv/models/object_detection/faster_rcnn/faster_rcnn.py index 2bff8391e4..13f3804c60 100644 --- a/keras_cv/models/object_detection/faster_rcnn/faster_rcnn.py +++ b/keras_cv/models/object_detection/faster_rcnn/faster_rcnn.py @@ -225,7 +225,7 @@ def get_config(self): class FasterRCNN(keras.Model): """A Keras model implementing the FasterRCNN architecture. - Implements the FasterRCNN architecture for object detection. The constructor + Implements the FasterRCNN architecture for object detection. The constructor requires `num_classes`, `bounding_box_format` and a `backbone`. References: @@ -241,34 +241,37 @@ class FasterRCNN(keras.Model): ``` Args: - num_classes: the number of classes in your dataset excluding the background - class. classes should be represented by integers in the range - [0, num_classes). + num_classes: the number of classes in your dataset excluding the + background class. classes should be represented by integers in the + range [0, num_classes). bounding_box_format: The format of bounding boxes of model output. Refer [to the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/) for more details on supported bounding box formats. - backbone: Optional `keras.Model`. Must implement the `pyramid_level_inputs` - property with keys 2, 3, 4, and 5 and layer names as values. 
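As a side note on the `_create_bounding_box_dataset` test utility above ("all classes are 0, all boxes are exactly the same, [1, 1, 2, 2] in xyxy"), a minimal sketch of such a dataset is shown below. The `"boxes"` / `"classes"` dictionary keys follow the KerasCV bounding-box convention; adjust them if your version expects a different layout.

```python
import tensorflow as tf

# Five copies of one random image, each labelled with a single class-0 box
# at [1, 1, 2, 2] in xyxy coordinates.
xs = tf.tile(tf.random.normal((1, 512, 512, 3)), [5, 1, 1, 1])
ys = {
    "boxes": tf.tile(tf.constant([[[1.0, 1.0, 2.0, 2.0]]]), [5, 1, 1]),
    "classes": tf.zeros((5, 1)),
}
dataset = tf.data.Dataset.from_tensor_slices((xs, ys)).batch(5)
```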
If - `None`, defaults to `keras_cv.models.ResNet50Backbone()`. - anchor_generator: (Optional) a `keras_cv.layers.AnchorGeneratot`. It is used - in the model to match ground truth boxes and labels with anchors, or with - region proposals. By default it uses the sizes and ratios from the paper, - that is optimized for image size between [640, 800]. The users should pass - their own anchor generator if the input image size differs from paper. - For now, only anchor generator with per level dict output is supported, - label_encoder: (Optional) a keras.Layer that accepts an anchors Tensor, a - bounding box Tensor and a bounding box class Tensor to its `call()` method, - and returns RetinaNet training targets. It returns box and class targets as - well as sample weights. - rcnn_head: (Optional) a `keras.layers.Layer` that takes input feature map and - returns a box delta prediction (in reference to rois) and multi-class prediction - (all foreground classes + one background class). By default it uses the rcnn head - from paper, which is 2 FC layer with 1024 dimension, 1 box regressor and 1 - softmax classifier. - prediction_decoder: (Optional) a `keras.layers.Layer` that takes input box prediction and - softmaxed score prediction, and returns NMSed box prediction, NMSed softmaxed - score prediction, NMSed class prediction, and NMSed valid detection. - """ + backbone: Optional `keras.Model`. Must implement the + `pyramid_level_inputs` property with keys 2, 3, 4, and 5 and layer + names as values. If `None`, defaults to + `keras_cv.models.ResNet50Backbone()`. + anchor_generator: (Optional) a `keras_cv.layers.AnchorGeneratot`. It is + used in the model to match ground truth boxes and labels with + anchors, or with region proposals. By default it uses the sizes and + ratios from the paper, that is optimized for image size between + [640, 800]. The users should pass their own anchor generator if the + input image size differs from paper. For now, only anchor generator + with per level dict output is supported, + label_encoder: (Optional) a keras.Layer that accepts an anchors Tensor, + a bounding box Tensor and a bounding box class Tensor to its + `call()` method, and returns RetinaNet training targets. It returns + box and class targets as well as sample weights. + rcnn_head: (Optional) a `keras.layers.Layer` that takes input feature + map and returns a box delta prediction (in reference to rois) and + multi-class prediction (all foreground classes + one background + class). By default it uses the rcnn head from paper, which is 2 FC + layer with 1024 dimension, 1 box regressor and 1 softmax classifier. + prediction_decoder: (Optional) a `keras.layers.Layer` that takes input + box prediction and softmaxed score prediction, and returns NMSed box + prediction, NMSed softmaxed score prediction, NMSed class + prediction, and NMSed valid detection. + """ # noqa: E501 def __init__( self, @@ -598,11 +601,13 @@ def _validate_and_get_loss(loss, loss_name): loss = keras.losses.get(loss) if loss is None or not isinstance(loss, keras.losses.Loss): raise ValueError( - f"FasterRCNN only accepts `keras.losses.Loss` for {loss_name}, got {loss}" + f"FasterRCNN only accepts `keras.losses.Loss` for {loss_name}, " + f"got {loss}" ) if loss.reduction != keras.losses.Reduction.SUM: logging.info( - f"FasterRCNN only accepts `SUM` reduction, got {loss.reduction}, automatically converted." + f"FasterRCNN only accepts `SUM` reduction, got {loss.reduction}, " + "automatically converted." 
) loss.reduction = keras.losses.Reduction.SUM return loss diff --git a/keras_cv/models/object_detection/faster_rcnn/faster_rcnn_test.py b/keras_cv/models/object_detection/faster_rcnn/faster_rcnn_test.py index 3f998daffe..ab52bd8e59 100644 --- a/keras_cv/models/object_detection/faster_rcnn/faster_rcnn_test.py +++ b/keras_cv/models/object_detection/faster_rcnn/faster_rcnn_test.py @@ -80,7 +80,7 @@ def test_invalid_compile(self): @pytest.mark.skipif( "INTEGRATION" not in os.environ or os.environ["INTEGRATION"] != "true", reason="Takes a long time to run, only runs when INTEGRATION " - "environment variable is set. To run the test please run: \n" + "environment variable is set. To run the test please run: \n" "`INTEGRATION=true pytest keras_cv/", ) def test_faster_rcnn_with_dictionary_input_format(self): diff --git a/keras_cv/models/object_detection/retina_net/__init__.py b/keras_cv/models/object_detection/retina_net/__init__.py index 51602a02c1..17e0a01010 100644 --- a/keras_cv/models/object_detection/retina_net/__init__.py +++ b/keras_cv/models/object_detection/retina_net/__init__.py @@ -17,6 +17,6 @@ from keras_cv.models.object_detection.retina_net.prediction_head import ( PredictionHead, ) -from keras_cv.models.object_detection.retina_net.retina_net_label_encoder import ( +from keras_cv.models.object_detection.retina_net.retina_net_label_encoder import ( # noqa: E501 RetinaNetLabelEncoder, ) diff --git a/keras_cv/models/object_detection/retina_net/retina_net.py b/keras_cv/models/object_detection/retina_net/retina_net.py index 19a7f5981d..5a72c34a40 100644 --- a/keras_cv/models/object_detection/retina_net/retina_net.py +++ b/keras_cv/models/object_detection/retina_net/retina_net.py @@ -41,14 +41,14 @@ BOX_VARIANCE = [0.1, 0.1, 0.2, 0.2] -# TODO(jbischof): Generalize `FeaturePyramid` class to allow for any P-levels and -# add `feature_pyramid_levels` param. +# TODO(jbischof): Generalize `FeaturePyramid` class to allow for any P-levels +# and add `feature_pyramid_levels` param. @keras.utils.register_keras_serializable(package="keras_cv") class RetinaNet(Task): """A Keras model implementing the RetinaNet meta-architecture. - Implements the RetinaNet architecture for object detection. The constructor - requires `num_classes`, `bounding_box_format`, and a backbone. Optionally, + Implements the RetinaNet architecture for object detection. The constructor + requires `num_classes`, `bounding_box_format`, and a backbone. Optionally, a custom label encoder, and prediction decoder may be provided. Examples: @@ -86,43 +86,46 @@ class RetinaNet(Task): ``` Args: - num_classes: the number of classes in your dataset excluding the background - class. classes should be represented by integers in the range - [0, num_classes). - bounding_box_format: The format of bounding boxes of input dataset. Refer + num_classes: the number of classes in your dataset excluding the + background class. Classes should be represented by integers in the + range [0, num_classes). + bounding_box_format: The format of bounding boxes of input dataset. + Refer [to the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/) for more details on supported bounding box formats. backbone: `keras.Model`. Must implement the `pyramid_level_inputs` property with keys 3, 4, and 5 and layer names as values. A somewhat sensible backbone to use in many cases is the: `keras_cv.models.ResNetBackbone.from_preset("resnet50_imagenet")` - anchor_generator: (Optional) a `keras_cv.layers.AnchorGenerator`. 
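Stepping back to the `_validate_and_get_loss` hunk above: FasterRCNN only accepts `keras.losses.Loss` instances and silently converts their reduction to `SUM`. A hypothetical way to construct compliant losses up front, so the logged conversion never triggers, is sketched below; the choice of Huber and categorical cross-entropy here is purely illustrative.

```python
from tensorflow import keras

# Losses built with SUM reduction already satisfy the check described above,
# so no automatic conversion (and no info log) occurs.
box_loss = keras.losses.Huber(reduction=keras.losses.Reduction.SUM)
classification_loss = keras.losses.CategoricalCrossentropy(
    from_logits=True, reduction=keras.losses.Reduction.SUM
)
```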
If provided, - the anchor generator will be passed to both the `label_encoder` and the - `prediction_decoder`. Only to be used when both `label_encoder` and - `prediction_decoder` are both `None`. Defaults to an anchor generator with - the parameterization: `strides=[2**i for i in range(3, 8)]`, + anchor_generator: (Optional) a `keras_cv.layers.AnchorGenerator`. If + provided, the anchor generator will be passed to both the + `label_encoder` and the `prediction_decoder`. Only to be used when + both `label_encoder` and `prediction_decoder` are both `None`. + Defaults to an anchor generator with the parameterization: + `strides=[2**i for i in range(3, 8)]`, `scales=[2**x for x in [0, 1 / 3, 2 / 3]]`, `sizes=[32.0, 64.0, 128.0, 256.0, 512.0]`, and `aspect_ratios=[0.5, 1.0, 2.0]`. label_encoder: (Optional) a keras.Layer that accepts an image Tensor, a - bounding box Tensor and a bounding box class Tensor to its `call()` method, - and returns RetinaNet training targets. By default, a KerasCV standard - `RetinaNetLabelEncoder` is created and used. Results of this object's - `call()` method are passed to the `loss` object for `box_loss` and - `classification_loss` the `y_true` argument. - prediction_decoder: (Optional) A `keras.layers.Layer` that is responsible - for transforming RetinaNet predictions into usable bounding box - Tensors. If not provided, a default is provided. The default - `prediction_decoder` layer is a + bounding box Tensor and a bounding box class Tensor to its `call()` + method, and returns RetinaNet training targets. By default, a + KerasCV standard `RetinaNetLabelEncoder` is created and used. + Results of this object's `call()` method are passed to the `loss` + object for `box_loss` and `classification_loss` the `y_true` + argument. + prediction_decoder: (Optional) A `keras.layers.Layer` that is + responsible for transforming RetinaNet predictions into usable + bounding box Tensors. If not provided, a default is provided. The + default `prediction_decoder` layer is a `keras_cv.layers.MultiClassNonMaxSuppression` layer, which uses a Non-Max Suppression for box pruning. - classification_head: (Optional) A `keras.Layer` that performs classification of - the bounding boxes. If not provided, a simple ConvNet with 3 layers will be - used. - box_head: (Optional) A `keras.Layer` that performs regression of - the bounding boxes. If not provided, a simple ConvNet with 3 layers will be - used. - """ + classification_head: (Optional) A `keras.Layer` that performs + classification of the bounding boxes. If not provided, a simple + ConvNet with 3 layers will be used. + box_head: (Optional) A `keras.Layer` that performs regression of the + bounding boxes. If not provided, a simple ConvNet with 3 layers + will be used. + """ # noqa: E501 def __init__( self, @@ -141,13 +144,14 @@ def __init__( ): raise ValueError( "`anchor_generator` is only to be provided when " - "both `label_encoder` and `prediction_decoder` are both `None`. " - f"Received `anchor_generator={anchor_generator}` " + "both `label_encoder` and `prediction_decoder` are both " + f"`None`. Received `anchor_generator={anchor_generator}` " f"`label_encoder={label_encoder}`, " - f"`prediction_decoder={prediction_decoder}`. To customize the behavior of " - "the anchor_generator inside of a custom `label_encoder` or custom " - "`prediction_decoder` you should provide both to `RetinaNet`, and ensure " - "that the `anchor_generator` provided to both is identical" + f"`prediction_decoder={prediction_decoder}`. 
To customize the " + "behavior of the anchor_generator inside of a custom " + "`label_encoder` or custom `prediction_decoder` you should " + "provide both to `RetinaNet`, and ensure that the " + "`anchor_generator` provided to both is identical" ) anchor_generator = ( anchor_generator @@ -217,9 +221,10 @@ def __init__( if bounding_box_format.lower() != "xywh": raise ValueError( "`keras_cv.models.RetinaNet` only supports the 'xywh' " - "`bounding_box_format`. In future releases, more formats will be " - "supported. For now, please pass `bounding_box_format='xywh'`. " - f"Received `bounding_box_format={bounding_box_format}`" + "`bounding_box_format`. In future releases, more formats will " + "be supported. For now, please pass " + "`bounding_box_format='xywh'`. Received " + f"`bounding_box_format={bounding_box_format}`" ) self.bounding_box_format = bounding_box_format @@ -227,10 +232,10 @@ def __init__( if num_classes == 1: raise ValueError( "RetinaNet must always have at least 2 classes. " - "This is because logits are passed through a `tf.softmax()` call " - "before `MultiClassNonMaxSuppression()` is applied. If only " - "a single class is present, the model will always give a score of " - "`1` for the single present class." + "This is because logits are passed through a `tf.softmax()` " + "call before `MultiClassNonMaxSuppression()` is applied. If " + "only a single class is present, the model will always give a " + "score of `1` for the single present class." ) self.backbone = backbone @@ -328,33 +333,34 @@ def compile( """compiles the RetinaNet. compile() mirrors the standard Keras compile() method, but has a few key - distinctions. Primarily, all metrics must support bounding boxes, and + distinctions. Primarily, all metrics must support bounding boxes, and two losses must be provided: `box_loss` and `classification_loss`. Args: - box_loss: a Keras loss to use for box offset regression. Preconfigured - losses are provided when the string "huber" or "smoothl1" are passed. + box_loss: a Keras loss to use for box offset regression. + Preconfigured losses are provided when the string "huber" or + "smoothl1" are passed. classification_loss: a Keras loss to use for box classification. - A preconfigured `FocalLoss` is provided when the string "focal" is - passed. + A preconfigured `FocalLoss` is provided when the string "focal" + is passed. weight_decay: a float for variable weight decay. metrics: KerasCV object detection metrics that accept decoded - bounding boxes as their inputs. Examples of this metric type are - `keras_cv.metrics.BoxRecall()` and - `keras_cv.metrics.BoxMeanAveragePrecision()`. When `metrics` are + bounding boxes as their inputs. Examples of this metric type + are `keras_cv.metrics.BoxRecall()` and + `keras_cv.metrics.BoxMeanAveragePrecision()`. When `metrics` are included in the call to `compile()`, the RetinaNet will perform - non max suppression decoding during the forward pass. By - default the RetinaNet uses a + non-max suppression decoding during the forward pass. By + default, the RetinaNet uses a `keras_cv.layers.MultiClassNonMaxSuppression()` layer to - perform decoding. This behavior can be customized by passing in a - `prediction_decoder` to the constructor or by modifying the + perform decoding. This behavior can be customized by passing in + a `prediction_decoder` to the constructor or by modifying the `prediction_decoder` attribute on the model. 
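For readers skimming this hunk: the default anchor parameterization quoted in the RetinaNet docstring above translates to roughly the construction below. This is a sketch mirroring the documented values, not the exact code RetinaNet runs internally; argument order and `clip_boxes` handling may differ.

```python
import keras_cv

# Anchor generator matching the documented defaults: strides for P3-P7,
# three scales per level, five base sizes, and three aspect ratios.
anchor_generator = keras_cv.layers.AnchorGenerator(
    bounding_box_format="xywh",
    sizes=[32.0, 64.0, 128.0, 256.0, 512.0],
    scales=[2**x for x in [0, 1 / 3, 2 / 3]],
    aspect_ratios=[0.5, 1.0, 2.0],
    strides=[2**i for i in range(3, 8)],
)
```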
It should be noted - that the default non max suppression operation does not have + that the default non-max suppression operation does not have TPU support, and thus when training on TPU metrics must be evaluated in a `keras.utils.SidecarEvaluator` or a `keras.callbacks.Callback`. - kwargs: most other `keras.Model.compile()` arguments are supported and - propagated to the `keras.Model` class. + kwargs: most other `keras.Model.compile()` arguments are supported + and propagated to the `keras.Model` class. """ if loss is not None: raise ValueError( @@ -377,9 +383,11 @@ def compile( if box_loss.bounding_box_format != self.bounding_box_format: raise ValueError( "Wrong `bounding_box_format` passed to `box_loss` in " - "`RetinaNet.compile()`. " - f"Got `box_loss.bounding_box_format={box_loss.bounding_box_format}`, " - f"want `box_loss.bounding_box_format={self.bounding_box_format}`" + "`RetinaNet.compile()`. Got " + "`box_loss.bounding_box_format=" + f"{box_loss.bounding_box_format}`, want " + "`box_loss.bounding_box_format=" + f"{self.bounding_box_format}`" ) self.box_loss = box_loss @@ -396,21 +404,23 @@ def compile( def compute_loss(self, x, box_pred, cls_pred, boxes, classes): if boxes.shape[-1] != 4: raise ValueError( - "boxes should have shape (None, None, 4). Got " + "boxes should have shape (None, None, 4). Got " f"boxes.shape={tuple(boxes.shape)}" ) if box_pred.shape[-1] != 4: raise ValueError( - "box_pred should have shape (None, None, 4). " - f"Got box_pred.shape={tuple(box_pred.shape)}. Does your model's `num_classes` " - "parameter match your losses `num_classes` parameter?" + "box_pred should have shape (None, None, 4). Got " + f"box_pred.shape={tuple(box_pred.shape)}. Does your model's " + "`num_classes` parameter match your losses `num_classes` " + "parameter?" ) if cls_pred.shape[-1] != self.num_classes: raise ValueError( - "cls_pred should have shape (None, None, 4). " - f"Got cls_pred.shape={tuple(cls_pred.shape)}. Does your model's `num_classes` " - "parameter match your losses `num_classes` parameter?" + "cls_pred should have shape (None, None, 4). Got " + f"cls_pred.shape={tuple(cls_pred.shape)}. Does your model's " + "`num_classes` parameter match your losses `num_classes` " + "parameter?" ) cls_labels = tf.one_hot( @@ -451,7 +461,7 @@ def train_step(self, data): images=x, ) boxes, classes = self.label_encoder(x, y_for_label_encoder) - # boxes are now in `center_yxhw`. This is always the case in training + # boxes are now in `center_yxhw`. This is always the case in training with tf.GradientTape() as tape: outputs = self(x, training=True) box_pred, cls_pred = outputs["box"], outputs["classification"] @@ -537,14 +547,16 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return copy.deepcopy( {**backbone_presets_with_weights, **retina_net_presets} ) @classproperty def backbone_presets(cls): - """Dictionary of preset names and configurations of compatible backbones.""" + """Dictionary of preset names and configurations of compatible + backbones.""" return copy.deepcopy(backbone_presets) @@ -563,7 +575,7 @@ def _parse_box_loss(loss): raise ValueError( "Expected `box_loss` to be either a Keras Loss, " - f"callable, or the string 'SmoothL1'. Got loss={loss}." + f"callable, or the string 'SmoothL1'. Got loss={loss}." 
) @@ -580,5 +592,5 @@ def _parse_classification_loss(loss): raise ValueError( "Expected `classification_loss` to be either a Keras Loss, " - f"callable, or the string 'Focal'. Got loss={loss}." + f"callable, or the string 'Focal'. Got loss={loss}." ) diff --git a/keras_cv/models/object_detection/retina_net/retina_net_label_encoder.py b/keras_cv/models/object_detection/retina_net/retina_net_label_encoder.py index 299f30e6f7..d2c8b589e3 100644 --- a/keras_cv/models/object_detection/retina_net/retina_net_label_encoder.py +++ b/keras_cv/models/object_detection/retina_net/retina_net_label_encoder.py @@ -27,26 +27,28 @@ class RetinaNetLabelEncoder(layers.Layer): This class has operations to generate targets for a batch of samples which is made up of the input images, bounding boxes for the objects present and - their class ids. Targets are always represented in `center_yxwh` format. + their class ids. Targets are always represented in `center_yxwh` format. This done for numerical reasons, to ensure numerical consistency when training in any format. Args: - bounding_box_format: The format of bounding boxes of input dataset. Refer - [to the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/) - for more details on supported bounding box formats. - anchor_generator: `keras_cv.layers.AnchorGenerator` instance to produce anchor - boxes. Boxes are then used to encode labels on a per-image basis. - positive_threshold: the float threshold to set an anchor to positive match to gt box. - values above it are positive matches. - negative_threshold: the float threshold to set an anchor to negative match to gt box. - values below it are negative matches. - box_variance: The scaling factors used to scale the bounding box targets. - Defaults to (0.1, 0.1, 0.2, 0.2). - background_class: (Optional) The class ID used for the background class. - Defaults to -1. - ignore_class: (Optional) The class ID used for the ignore class. Defaults to -2. - """ + bounding_box_format: The format of bounding boxes of input dataset. + Refer [to the keras.io docs](https://keras.io/api/keras_cv/bounding_box/formats/) for more + details on supported bounding box formats. + anchor_generator: `keras_cv.layers.AnchorGenerator` instance to produce + anchor boxes. Boxes are then used to encode labels on a per-image + basis. + positive_threshold: the float threshold to set an anchor to positive + match to gt box. Values above it are positive matches. + negative_threshold: the float threshold to set an anchor to negative + match to gt box. Values below it are negative matches. + box_variance: The scaling factors used to scale the bounding box + targets, defaults to (0.1, 0.1, 0.2, 0.2). + background_class: (Optional) The class ID used for the background class, + defaults to -1. + ignore_class: (Optional) The class ID used for the ignore class, + defaults to -2. + """ # noqa: E501 def __init__( self, @@ -137,11 +139,11 @@ def _encode_sample(self, box_labels, anchor_boxes): ) label = tf.concat([box_target, cls_target], axis=-1) - # In the case that a box in the corner of an image matches with an all -1 box - # that is outside of the image, we should assign the box to the ignore class - # There are rare cases where a -1 box can be matched, resulting in a NaN during - # training. The unit test passing all -1s to the label encoder ensures that we - # properly handle this edge-case. 
+ # In the case that a box in the corner of an image matches with an all + # -1 box that is outside the image, we should assign the box to the + # ignore class. There are rare cases where a -1 box can be matched, + # resulting in a NaN during training. The unit test passing all -1s to + # the label encoder ensures that we properly handle this edge-case. label = tf.where( tf.expand_dims( tf.math.reduce_any(tf.math.is_nan(label), axis=-1), axis=-1 @@ -177,13 +179,13 @@ def call(self, images, box_labels): Args: images: a batched [batch_size, H, W, C] image float `tf.Tensor`. box_labels: a batched KerasCV style bounding box dictionary containing - bounding boxes and class labels. Should be in `bounding_box_format`. + bounding boxes and class labels. Should be in `bounding_box_format`. """ if isinstance(images, tf.RaggedTensor): raise ValueError( "`RetinaNetLabelEncoder`'s `call()` method does not " - "support RaggedTensor inputs for the `images` argument. Received " - f"`type(images)={type(images)}`." + "support RaggedTensor inputs for the `images` argument. " + f"Received `type(images)={type(images)}`." ) box_labels = bounding_box.to_dense(box_labels) diff --git a/keras_cv/models/object_detection/retina_net/retina_net_presets.py b/keras_cv/models/object_detection/retina_net/retina_net_presets.py index 7806b9c107..4674666f97 100644 --- a/keras_cv/models/object_detection/retina_net/retina_net_presets.py +++ b/keras_cv/models/object_detection/retina_net/retina_net_presets.py @@ -20,7 +20,7 @@ "description": ( "RetinaNet with a ResNet50 v1 backbone. " "Trained on PascalVOC 2012 object detection task, which " - "consists of 20 classes. This model achieves a final MaP of " + "consists of 20 classes. This model achieves a final MaP of " "0.33 on the evaluation set." ), }, @@ -30,7 +30,7 @@ # performance. "num_classes": 21, }, - "weights_url": "https://storage.googleapis.com/keras-cv/models/retinanet/pascal_voc/resnet50.weights.h5", + "weights_url": "https://storage.googleapis.com/keras-cv/models/retinanet/pascal_voc/resnet50.weights.h5", # noqa: E501 "weights_hash": "c9b11357b289512adf1e6077ab7da73f", }, } diff --git a/keras_cv/models/object_detection/retina_net/retina_net_test.py b/keras_cv/models/object_detection/retina_net/retina_net_test.py index e6f8fa74b5..4bb66c8dbd 100644 --- a/keras_cv/models/object_detection/retina_net/retina_net_test.py +++ b/keras_cv/models/object_detection/retina_net/retina_net_test.py @@ -32,7 +32,8 @@ def cleanup_global_session(self): # Code before yield runs before the test tf.config.set_soft_device_placement(False) yield - # Reset soft device placement to not interfere with other unit test files + # Reset soft device placement to not interfere with other unit test + # files tf.config.set_soft_device_placement(True) keras.backend.clear_session() @@ -55,7 +56,7 @@ def test_retina_net_construction(self): @pytest.mark.skipif( "INTEGRATION" not in os.environ or os.environ["INTEGRATION"] != "true", reason="Takes a long time to run, only runs when INTEGRATION " - "environment variable is set. To run the test please run: \n" + "environment variable is set. 
To run the test please run: \n" "`INTEGRATION=true pytest keras_cv/", ) def test_retina_net_call(self): diff --git a/keras_cv/models/object_detection_3d/center_pillar.py b/keras_cv/models/object_detection_3d/center_pillar.py index b10830dd12..f486bdcd5e 100644 --- a/keras_cv/models/object_detection_3d/center_pillar.py +++ b/keras_cv/models/object_detection_3d/center_pillar.py @@ -143,13 +143,13 @@ class MultiHeadCenterPillar(keras.Model): backbone: the backbone to apply to voxelized features. voxel_net: the voxel_net that takes point cloud feature and convert to voxelized features. - multiclass_head: a multi class head which returns a dict of heatmap prediction - and regression prediction per class. + multiclass_head: a multi class head which returns a dict of heatmap + prediction and regression prediction per class. label_encoder: a LabelEncoder that takes point cloud xyz and point cloud features and returns a multi class labels which is a dict of heatmap, box location and top_k heatmap index per class. - prediction_decoder: a multi class heatmap prediction decoder that returns a dict - of decoded boxes, box class, and box confidence score per class. + prediction_decoder: a multi class heatmap prediction decoder that returns + a dict of decoded boxes, box class, and box confidence score per class. """ diff --git a/keras_cv/models/regnet.py b/keras_cv/models/regnet.py index c3fbb815c0..e6be1eb173 100644 --- a/keras_cv/models/regnet.py +++ b/keras_cv/models/regnet.py @@ -13,10 +13,9 @@ # limitations under the License. """RegNet models for KerasCV. References: - - [Designing Network Design Spaces](https://arxiv.org/abs/2003.13678) - (CVPR 2020) + - [Designing Network Design Spaces](https://arxiv.org/abs/2003.13678) (CVPR 2020) - [Based on the Original keras.applications RegNet](https://github.com/keras-team/keras/blob/master/keras/applications/regnet.py) -""" +""" # noqa: E501 import tensorflow as tf from tensorflow import keras @@ -210,9 +209,11 @@ - [Designing Network Design Spaces](https://arxiv.org/abs/2003.13678) (CVPR 2020) - For image classification use cases, see [this page for detailed examples](https://keras.io/api/applications/#usage-examples-for-image-classification-models). + For image classification use cases, see + [this page for detailed examples](https://keras.io/api/applications/#usage-examples-for-image-classification-models). - For transfer learning use cases, make sure to read the [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). + For transfer learning use cases, make sure to read the + [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). The naming of models is as follows: `RegNet` where @@ -221,20 +222,20 @@ Y block and 6.4 giga flops (64 hundred million flops). Args: - include_rescaling: whether or not to Rescale the inputs.If set to True, + include_rescaling: whether to rescale the inputs. If set to True, inputs will be passed through a `Rescaling(1/255.0)` layer. include_top: Whether to include the fully-connected layer at the top of the network. num_classes: Optional number of classes to classify images into, only to be specified if `include_top` is True. weights: One of `None` (random initialization), or the path to the weights - file to be loaded. Defaults to `None`. + file to be loaded, defaults to `None`. input_tensor: Optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model. input_shape: Optional shape tuple, defaults to (None, None, 3). 
It should have exactly 3 inputs channels. pooling: Optional pooling mode for feature extraction - when `include_top` is `False`. Defaults to None. + when `include_top` is `False`, defaults to None. - `None` means that the output of the model will be the 4D tensor output of the last convolutional layer. @@ -253,7 +254,7 @@ Returns: A `keras.Model` instance. -""" +""" # noqa: E501 def apply_conv2d_bn( @@ -673,28 +674,28 @@ class RegNet(keras.Model): depths: iterable, Contains depths for each individual stages. widths: iterable, Contains output channel width of each individual stages - group_width: int, Number of channels to be used in each group. See grouped - convolutions for more information. + group_width: int, Number of channels to be used in each group. See + grouped convolutions for more information. block_type: Must be one of `{"X", "Y", "Z"}`. For more details see the papers "Designing network design spaces" and "Fast and Accurate Model Scaling" - default_size: tuple (or) list, Default input image size. + default_size: tuple (or) list, default input image size. model_name: str, An optional name for the model. - include_rescaling: bool, whether or not to Rescale the inputs.If set to True, - inputs will be passed through a `Rescaling(1/255.0)` layer. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(1/255.0)` layer. include_top: bool, Whether to include the fully-connected layer at the top of the network. num_classes: int, Optional number of classes to classify images into, only to be specified if `include_top` is True, and if no `weights` argument is specified. weights: str, One of `None` (random initialization), or the path to the - weights file to be loaded. Defaults to `None`. - input_tensor: Tensor, Optional Keras tensor (i.e. output of `layers.Input()`) - to use as image input for the model. + weights file to be loaded, defaults to `None`. + input_tensor: Tensor, Optional Keras tensor (i.e. output of + `layers.Input()`) to use as image input for the model. input_shape: Optional shape tuple, defaults to (None, None, 3). It should have exactly 3 inputs channels. pooling: Optional pooling mode for feature extraction - when `include_top` is `False`. Defaults to None. + when `include_top` is `False`, defaults to None. - `None` means that the output of the model will be the 4D tensor output of the last convolutional layer. diff --git a/keras_cv/models/segmentation/deeplab.py b/keras_cv/models/segmentation/deeplab.py index 363075e4c5..79620d1b67 100644 --- a/keras_cv/models/segmentation/deeplab.py +++ b/keras_cv/models/segmentation/deeplab.py @@ -27,18 +27,20 @@ class DeepLabV3(keras.Model): """A segmentation model based on DeepLab v3. Args: - num_classes: int, the number of classes for the detection model. Note that - the num_classes doesn't contain the background class, and the classes - from the data should be represented by integers with range + num_classes: int, the number of classes for the detection model. Note + that the num_classes doesn't contain the background class, and the + classes from the data should be represented by integers with range [0, num_classes). - backbone: Optional backbone network for the model. Should be a KerasCV model. + backbone: Optional backbone network for the model. Should be a KerasCV + model. weights: Weights for the complete DeepLabV3 model. one of `None` (random initialization), a pretrained weight file path, or a reference to - pre-trained weights (e.g. 
'imagenet/classification' or 'voc/segmentation') (see available - pre-trained weights in weights.py) - spatial_pyramid_pooling: Also known as Atrous Spatial Pyramid Pooling (ASPP). - Performs spatial pooling on different spatial levels in the pyramid, with - dilation. + pre-trained weights (e.g. 'imagenet/classification' or + 'voc/segmentation') (see available pre-trained weights in + weights.py) + spatial_pyramid_pooling: Also known as Atrous Spatial Pyramid Pooling + (ASPP). Performs spatial pooling on different spatial levels in the + pyramid, with dilation. segmentation_head: Optional `keras.Layer` that predict the segmentation mask based on feature from backbone and feature from decoder. """ @@ -59,15 +61,17 @@ def __init__( if not isinstance(backbone, keras.layers.Layer): raise ValueError( "Argument `backbone` must be a `keras.layers.Layer` instance. " - f"Received instead backbone={backbone} (of type {type(backbone)})." + f"Received instead backbone={backbone} (of type " + f"{type(backbone)})." ) if weights and not tf.io.gfile.exists( parse_weights(weights, True, "deeplabv3") ): raise ValueError( - "The `weights` argument should be either `None` or the path to the " - "weights file to be loaded. Weights file not found at location: {weights}" + "The `weights` argument should be either `None` or the path " + "to the weights file to be loaded. Weights file not found at " + "location: {weights}" ) inputs = utils.parse_model_inputs(input_shape, input_tensor) @@ -78,9 +82,9 @@ def __init__( if input_shape[0] is None and input_shape[1] is None: raise ValueError( - "Input shapes for both the backbone and DeepLabV3 cannot be `None`. " - f"Received: input_shape={input_shape} and " - f"backbone.input_shape={backbone.input_shape[1:]}" + "Input shapes for both the backbone and DeepLabV3 cannot be " + "`None`. Received: input_shape={input_shape} and " + "backbone.input_shape={backbone.input_shape[1:]}" ) height = input_shape[0] @@ -189,27 +193,30 @@ class SegmentationHead(layers.Layer): segmentation mask (pixel level classifications) as the output for the model. Args: - num_classes: int, number of output classes for the prediction. This should - include all the classes (e.g. background) for the model to predict. - convolutions: int, number of `Conv2D` layers that are stacked before the final - classification layer. Defaults to 2. - filters: int, number of filter/channels for the the conv2D layers. + num_classes: int, number of output classes for the prediction. This + should include all the classes (e.g. background) for the model to + predict. + convolutions: int, number of `Conv2D` layers that are stacked before the + final classification layer, defaults to 2. + filters: int, number of filter/channels for the conv2D layers. Defaults to 256. - activations: str or function, activation functions between the - conv2D layers and the final classification layer. Defaults to `"relu"`. - output_scale_factor: int, or a pair of ints. Factor for upsampling the output mask. - This is useful to scale the output mask back to same size as the input - image. When single int is provided, the mask will be scaled with same - ratio on both width and height. When a pair of ints are provided, they will - be parsed as `(height_factor, width_factor)`. Defaults to `None`, which means - no resize will happen to the output mask tensor. - kernel_size: int, the kernel size to be used in each of the convolutional blocks. - Defaults to 3. - use_bias: boolean, whether to use bias or not in each of the convolutional blocks. 
- Defaults to False since the blocks use `BatchNormalization`
- after each convolution, rendering bias obsolete.
+ activations: str or function, activation functions between the conv2D
+ layers and the final classification layer, defaults to `"relu"`.
+ output_scale_factor: int, or a pair of ints. Factor for upsampling the
+ output mask. This is useful to scale the output mask back to the same
+ size as the input image. When a single int is provided, the mask will
+ be scaled with the same ratio on both width and height. When a pair of
+ ints are provided, they will be parsed as `(height_factor,
+ width_factor)`. Defaults to `None`, which means no resize will
+ happen to the output mask tensor.
+ kernel_size: int, the kernel size to be used in each of the
+ convolutional blocks, defaults to 3.
+ use_bias: boolean, whether to use bias or not in each of the
+ convolutional blocks, defaults to False since the blocks use
+ `BatchNormalization` after each convolution, rendering bias
+ obsolete.
activation: str or function, activation to apply in the classification
- layer (output of the head). Defaults to `"softmax"`.
+ layer (output of the head), defaults to `"softmax"`.
Examples:
@@ -223,7 +230,8 @@ class SegmentationHead(layers.Layer):
head = SegmentationHead(num_classes=11)
output = head(inputs)
- # output tensor has shape [2, 32, 32, 11]. It has the same resolution as the p3.
+ # output tensor has shape [2, 32, 32, 11]. It has the same resolution as
+ # the p3.
```
"""
@@ -274,8 +282,8 @@ def __init__(
use_bias=False,
padding="same",
activation=self.activation,
- # Force the dtype of the classification head to float32 to avoid the NAN loss
- # issue when used with mixed precision API.
+ # Force the dtype of the classification head to float32 to avoid the
+ # NAN loss issue when used with mixed precision API.
dtype=tf.float32,
)
@@ -284,9 +292,10 @@ def __init__(
def call(self, inputs):
"""Forward path for the segmentation head.
- For now, it accepts the output from the decoder only, which is a dict with int
- key and tensor as value (level-> processed feature output). The head will use the
- lowest level of feature output as the input for the head.
+ For now, it accepts the output from the decoder only, which is a dict
+ with int key and tensor as value (level-> processed feature output). The
+ head will use the lowest level of feature output as the input for the
+ head.
"""
if not isinstance(inputs, dict):
raise ValueError(
diff --git a/keras_cv/models/segmentation/deeplab_test.py b/keras_cv/models/segmentation/deeplab_test.py
index 90981c04e2..47a319eb70 100644
--- a/keras_cv/models/segmentation/deeplab_test.py
+++ b/keras_cv/models/segmentation/deeplab_test.py
@@ -45,7 +45,8 @@ def test_greyscale_input(self):
def test_missing_input_shapes(self):
with self.assertRaisesRegex(
ValueError,
- "Input shapes for both the backbone and DeepLabV3 cannot be `None`.",
+ "Input shapes for both the backbone and DeepLabV3 "
+ "cannot be `None`.",
):
backbone = ResNet50V2Backbone()
segmentation.DeepLabV3(num_classes=11, backbone=backbone)
diff --git a/keras_cv/models/stable_diffusion/clip_tokenizer.py b/keras_cv/models/stable_diffusion/clip_tokenizer.py
index 0f0fa26ae6..61d7910fca 100644
--- a/keras_cv/models/stable_diffusion/clip_tokenizer.py
+++ b/keras_cv/models/stable_diffusion/clip_tokenizer.py
@@ -11,7 +11,8 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-"""This code is taken nearly verbatim from https://github.com/divamgupta/stable-diffusion-tensorflow.""" +"""This code is taken nearly verbatim from +https://github.com/divamgupta/stable-diffusion-tensorflow.""" import gzip import html @@ -26,10 +27,11 @@ def bytes_to_unicode(): """Return a list of utf-8 bytes and a corresponding list of unicode strings. The reversible bpe codes work on unicode strings. - This means you need a large # of unicode characters in your vocab if you want to avoid UNKs. - When you're at something like a 10B token dataset you end up needing around 5K for decent coverage. - This is a signficant percentage of your normal, say, 32K bpe vocab. - To avoid that, we want lookup tables between utf-8 bytes and unicode strings. + This means you need a large # of unicode characters in your vocab if you + want to avoid UNKs. When you're at something like a 10B token dataset you + end up needing around 5K for decent coverage. This is a significant + percentage of your normal, say, 32K bpe vocab. To avoid that, we want + lookup tables between utf-8 bytes and unicode strings. And avoids mapping to whitespace/control characters the bpe code barfs on. """ bs = ( @@ -51,7 +53,8 @@ def bytes_to_unicode(): def get_pairs(word): """Return set of symbol pairs in a word. - A word is represented as tuple of symbols (symbols being variable-length strings). + A word is represented as tuple of symbols(symbols being variable-length + strings). """ pairs = set() prev_char = word[0] @@ -76,8 +79,8 @@ class SimpleTokenizer: def __init__(self, bpe_path=None): bpe_path = bpe_path or keras.utils.get_file( "bpe_simple_vocab_16e6.txt.gz", - "https://github.com/openai/CLIP/blob/main/clip/bpe_simple_vocab_16e6.txt.gz?raw=true", - file_hash="924691ac288e54409236115652ad4aa250f48203de50a9e4722a6ecd48d6804a", + "https://github.com/openai/CLIP/blob/main/clip/bpe_simple_vocab_16e6.txt.gz?raw=true", # noqa: E501 + file_hash="924691ac288e54409236115652ad4aa250f48203de50a9e4722a6ecd48d6804a", # noqa: E501 ) self.byte_encoder = bytes_to_unicode() self.byte_decoder = {v: k for k, v in self.byte_encoder.items()} diff --git a/keras_cv/models/stable_diffusion/decoder.py b/keras_cv/models/stable_diffusion/decoder.py index 708af0efe9..fe619d324a 100644 --- a/keras_cv/models/stable_diffusion/decoder.py +++ b/keras_cv/models/stable_diffusion/decoder.py @@ -14,7 +14,7 @@ from tensorflow import keras -from keras_cv.models.stable_diffusion.__internal__.layers.attention_block import ( +from keras_cv.models.stable_diffusion.__internal__.layers.attention_block import ( # noqa: E501 AttentionBlock, ) from keras_cv.models.stable_diffusion.__internal__.layers.padded_conv2d import ( @@ -63,7 +63,7 @@ def __init__(self, img_height, img_width, name=None, download_weights=True): if download_weights: decoder_weights_fpath = keras.utils.get_file( - origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/kcv_decoder.h5", - file_hash="ad350a65cc8bc4a80c8103367e039a3329b4231c2469a1093869a345f55b1962", + origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/kcv_decoder.h5", # noqa: E501 + file_hash="ad350a65cc8bc4a80c8103367e039a3329b4231c2469a1093869a345f55b1962", # noqa: E501 ) self.load_weights(decoder_weights_fpath) diff --git a/keras_cv/models/stable_diffusion/diffusion_model.py b/keras_cv/models/stable_diffusion/diffusion_model.py index 9c01f1c27b..25b5241aeb 100644 --- a/keras_cv/models/stable_diffusion/diffusion_model.py +++ b/keras_cv/models/stable_diffusion/diffusion_model.py @@ -108,8 +108,8 @@ 
def __init__( if download_weights: diffusion_model_weights_fpath = keras.utils.get_file( - origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/kcv_diffusion_model.h5", - file_hash="8799ff9763de13d7f30a683d653018e114ed24a6a819667da4f5ee10f9e805fe", + origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/kcv_diffusion_model.h5", # noqa: E501 + file_hash="8799ff9763de13d7f30a683d653018e114ed24a6a819667da4f5ee10f9e805fe", # noqa: E501 ) self.load_weights(diffusion_model_weights_fpath) @@ -202,8 +202,8 @@ def __init__( if download_weights: diffusion_model_weights_fpath = keras.utils.get_file( - origin="https://huggingface.co/ianstenbit/keras-sd2.1/resolve/main/diffusion_model_v2_1.h5", - file_hash="c31730e91111f98fe0e2dbde4475d381b5287ebb9672b1821796146a25c5132d", + origin="https://huggingface.co/ianstenbit/keras-sd2.1/resolve/main/diffusion_model_v2_1.h5", # noqa: E501 + file_hash="c31730e91111f98fe0e2dbde4475d381b5287ebb9672b1821796146a25c5132d", # noqa: E501 ) self.load_weights(diffusion_model_weights_fpath) diff --git a/keras_cv/models/stable_diffusion/image_encoder.py b/keras_cv/models/stable_diffusion/image_encoder.py index 4f50635e4b..80c920af22 100644 --- a/keras_cv/models/stable_diffusion/image_encoder.py +++ b/keras_cv/models/stable_diffusion/image_encoder.py @@ -14,7 +14,7 @@ from tensorflow import keras -from keras_cv.models.stable_diffusion.__internal__.layers.attention_block import ( +from keras_cv.models.stable_diffusion.__internal__.layers.attention_block import ( # noqa: E501 AttentionBlock, ) from keras_cv.models.stable_diffusion.__internal__.layers.padded_conv2d import ( @@ -51,16 +51,17 @@ def __init__(self, img_height=512, img_width=512, download_weights=True): keras.layers.Activation("swish"), PaddedConv2D(8, 3, padding=1), PaddedConv2D(8, 1), - # TODO(lukewood): can this be refactored to be a Rescaling layer? - # Perhaps some sort of rescale and gather? - # Either way, we may need a lambda to gather the first 4 dimensions. + # TODO(lukewood): can this be refactored to be a Rescaling + # layer? Perhaps some sort of rescale and gather? + # Either way, we may need a lambda to gather the first 4 + # dimensions. keras.layers.Lambda(lambda x: x[..., :4] * 0.18215), ] ) if download_weights: image_encoder_weights_fpath = keras.utils.get_file( - origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/vae_encoder.h5", - file_hash="c60fb220a40d090e0f86a6ab4c312d113e115c87c40ff75d11ffcf380aab7ebb", + origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/vae_encoder.h5", # noqa: E501 + file_hash="c60fb220a40d090e0f86a6ab4c312d113e115c87c40ff75d11ffcf380aab7ebb", # noqa: E501 ) self.load_weights(image_encoder_weights_fpath) diff --git a/keras_cv/models/stable_diffusion/noise_scheduler.py b/keras_cv/models/stable_diffusion/noise_scheduler.py index ba78f169e0..fdd8c6f571 100644 --- a/keras_cv/models/stable_diffusion/noise_scheduler.py +++ b/keras_cv/models/stable_diffusion/noise_scheduler.py @@ -14,7 +14,7 @@ """StableDiffusion Noise scheduler Adapted from https://github.com/huggingface/diffusers/blob/v0.3.0/src/diffusers/schedulers/scheduling_ddpm.py#L56 -""" +""" # noqa: E501 import tensorflow as tf @@ -25,15 +25,16 @@ class NoiseScheduler: train_timesteps: number of diffusion steps used to train the model. beta_start: the starting `beta` value of inference. beta_end: the final `beta` value. - beta_schedule: - the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. 
Choose from - `linear` or `quadratic`. - betas: a complete set of betas, in lieu of using one of the existing schedules. - variance_type: - options to clip the variance used when adding noise to the denoised sample. Choose from `fixed_small`, - `fixed_small_log`, `fixed_large`, `fixed_large_log`, `learned` or `learned_range`. - clip_sample: - option to clip predicted sample between -1 and 1 for numerical stability. + beta_schedule: the beta schedule, a mapping from a beta range to a + sequence of betas for stepping the model. Choose from `linear` or + `quadratic`. + betas: a complete set of betas, in lieu of using one of the existing + schedules. + variance_type: options to clip the variance used when adding noise to + the denoised sample. Choose from `fixed_small`, `fixed_small_log`, + `fixed_large`, `fixed_large_log`, `learned` or `learned_range`. + clip_sample: option to clip predicted sample between -1 and 1 for + numerical stability. """ def __init__( @@ -111,13 +112,17 @@ def step( predict_epsilon=True, ): """ - Predict the sample at the previous timestep by reversing the SDE. Core function to propagate the diffusion - process from the learned model outputs (usually the predicted noise). + Predict the sample at the previous timestep by reversing the SDE. Core + function to propagate the diffusion process from the learned model + outputs (usually the predicted noise). Args: - model_output: a Tensor containing direct output from learned diffusion model + model_output: a Tensor containing direct output from learned + diffusion model timestep: current discrete timestep in the diffusion chain. - sample: a Tensor containing the current instance of sample being created by diffusion process. - predict_epsilon: whether the model is predicting noise (epsilon) or samples + sample: a Tensor containing the current instance of sample being + created by diffusion process. + predict_epsilon: whether the model is predicting noise (epsilon) or + samples Returns: The predicted sample at the previous timestep """ @@ -143,7 +148,7 @@ def step( beta_prod_prev = 1 - alpha_prod_prev # 2. compute predicted original sample from predicted noise also called - # "predicted x_0" of formula (15) from https://arxiv.org/pdf/2006.11239.pdf + # "predicted x_0" of formula (15) from https://arxiv.org/pdf/2006.11239.pdf # noqa: E501 if predict_epsilon: pred_original_sample = ( sample - beta_prod ** (0.5) * model_output @@ -155,7 +160,8 @@ def step( if self.clip_sample: pred_original_sample = tf.clip_by_value(pred_original_sample, -1, 1) - # 4. Compute coefficients for pred_original_sample x_0 and current sample x_t + # 4. 
Compute coefficients for pred_original_sample x_0 and current + # sample x_t # See formula (7) from https://arxiv.org/pdf/2006.11239.pdf pred_original_sample_coeff = ( alpha_prod_prev ** (0.5) * self.betas[timestep] diff --git a/keras_cv/models/stable_diffusion/stable_diffusion.py b/keras_cv/models/stable_diffusion/stable_diffusion.py index 3b71b40b26..55b94d0d1b 100644 --- a/keras_cv/models/stable_diffusion/stable_diffusion.py +++ b/keras_cv/models/stable_diffusion/stable_diffusion.py @@ -15,10 +15,13 @@ Credits: -- Original implementation: https://github.com/CompVis/stable-diffusion -- Initial TF/Keras port: https://github.com/divamgupta/stable-diffusion-tensorflow +- Original implementation: + https://github.com/CompVis/stable-diffusion +- Initial TF/Keras port: + https://github.com/divamgupta/stable-diffusion-tensorflow -The current implementation is a rewrite of the initial TF/Keras port by Divam Gupta. +The current implementation is a rewrite of the initial TF/Keras port by +Divam Gupta. """ import math @@ -138,19 +141,18 @@ def generate_image( Args: encoded_text: Tensor of shape (`batch_size`, 77, 768), or a Tensor - of shape (77, 768). When the batch axis is omitted, the same encoded - text will be used to produce every generated image. - batch_size: number of images to generate. Default: 1. + of shape (77, 768). When the batch axis is omitted, the same + encoded text will be used to produce every generated image. + batch_size: int, number of images to generate, defaults to 1. negative_prompt: a string containing information to negatively guide - the image generation (e.g. by removing or altering certain aspects - of the generated image). - Default: None. - num_steps: number of diffusion steps (controls image quality). - Default: 50. - unconditional_guidance_scale: float controling how closely the image - should adhere to the prompt. Larger values result in more + the image generation (e.g. by removing or altering certain + aspects of the generated image), defaults to None. + num_steps: int, number of diffusion steps (controls image quality), + defaults to 50. + unconditional_guidance_scale: float, controlling how closely the + image should adhere to the prompt. Larger values result in more closely adhering to the prompt, but will make the image noisier. - Default: 7.5. + Defaults to 7.5. diffusion_noise: Tensor of shape (`batch_size`, img_height // 8, img_width // 8, 4), or a Tensor of shape (img_height // 8, img_width // 8, 4). Optional custom noise to seed the diffusion @@ -249,45 +251,48 @@ def inpaint( seed=None, verbose=True, ): - """Inpaints a masked section of the provided image based on the provided prompt. + """Inpaints a masked section of the provided image based on the provided + prompt. Note that this currently does not support mixed precision. Args: prompt: A string representing the prompt for generation. image: Tensor of shape (`batch_size`, `image_height`, `image_width`, - 3) with RGB values in [0, 255]. When the batch is omitted, the same - image will be used as the starting image. + 3) with RGB values in [0, 255]. When the batch is omitted, the + same image will be used as the starting image. mask: Tensor of shape (`batch_size`, `image_height`, `image_width`) - with binary values 0 or 1. When the batch is omitted, the same mask - will be used on all images. + with binary values 0 or 1. When the batch is omitted, the same + mask will be used on all images. negative_prompt: a string containing information to negatively guide - the image generation (e.g. 
by removing or altering certain aspects
- of the generated image).
- Default: None.
- num_resamples: number of times to resample the generated mask region.
- Increasing the number of resamples improves the semantic fit of the
- generated mask region w.r.t the rest of the image. Default: 1.
- batch_size: number of images to generate. Default: 1.
- num_steps: number of diffusion steps (controls image quality).
- Default: 25.
- unconditional_guidance_scale: float controlling how closely the image
- should adhere to the prompt. Larger values result in more
+ the image generation (e.g. by removing or altering certain
+ aspects of the generated image), defaults to None.
+ num_resamples: int, number of times to resample the generated mask
+ region. Increasing the number of resamples improves the semantic
+ fit of the generated mask region w.r.t the rest of the image.
+ Defaults to 1.
+ batch_size: int, number of images to generate, defaults to 1.
+ num_steps: int, number of diffusion steps (controls image quality),
+ defaults to 25.
+ unconditional_guidance_scale: float, controlling how closely the
+ image should adhere to the prompt. Larger values result in more
closely adhering to the prompt, but will make the image noisier.
- Default: 7.5.
+ Defaults to 7.5.
diffusion_noise: (Optional) Tensor of shape (`batch_size`,
img_height // 8, img_width // 8, 4), or a Tensor of shape
(img_height // 8, img_width // 8, 4). Optional custom noise to
seed the diffusion process. When the batch axis is omitted, the
- same noise will be used to seed diffusion for every generated image.
- seed: (Optional) integer which is used to seed the random generation of
- diffusion noise, only to be specified if `diffusion_noise` is None.
- verbose: whether to print progress bar. Default: True.
+ same noise will be used to seed diffusion for every generated
+ image.
+ seed: (Optional) integer which is used to seed the random generation
+ of diffusion noise, only to be specified if `diffusion_noise` is
+ None.
+ verbose: bool, whether to print progress bar, defaults to True.
"""
if diffusion_noise is not None and seed is not None:
raise ValueError(
"Please pass either diffusion_noise or seed to inpaint(), seed "
- "is only used to generate diffusion noise when it is not provided. "
- "Received both diffusion_noise and seed."
+ "is only used to generate diffusion noise when it is not "
+ "provided. Received both diffusion_noise and seed."
)
encoded_text = self.encode_text(prompt)
@@ -405,7 +410,8 @@ def _get_unconditional_context(self):
return unconditional_context
def _expand_tensor(self, text_embedding, batch_size):
- """Extends a tensor by repeating it to fit the shape of the given batch size."""
+ """Extends a tensor by repeating it to fit the shape of the given batch
+ size."""
text_embedding = tf.squeeze(text_embedding)
if text_embedding.shape.rank == 2:
text_embedding = tf.repeat(
@@ -440,8 +446,9 @@ def diffusion_model(self):
@property
def decoder(self):
- """decoder returns the diffusion image decoder model with pretrained weights.
- Can be overriden for tasks where the decoder needs to be modified.
+ """decoder returns the diffusion image decoder model with pretrained
+ weights. Can be overridden for tasks where the decoder needs to be
+ modified.
"""
if self._decoder is None:
self._decoder = Decoder(self.img_height, self.img_width)
@@ -452,7 +459,8 @@ def decoder(self):
@property
def tokenizer(self):
"""tokenizer returns the tokenizer used for text inputs.
- Can be overriden for tasks like textual inversion where the tokenizer needs to be modified.
+ Can be overridden for tasks like textual inversion where the tokenizer
+ needs to be modified.
"""
if self._tokenizer is None:
self._tokenizer = SimpleTokenizer()
@@ -503,18 +511,19 @@ class StableDiffusion(StableDiffusionBase):
future changes to these APIs.
Stable Diffusion is a powerful image generation model that can be used,
- among other things, to generate pictures according to a short text description
- (called a "prompt").
+ among other things, to generate pictures according to a short text
+ description (called a "prompt").
Arguments:
- img_height: Height of the images to generate, in pixel. Note that only
- multiples of 128 are supported; the value provided will be rounded
- to the nearest valid value. Default: 512.
- img_width: Width of the images to generate, in pixel. Note that only
- multiples of 128 are supported; the value provided will be rounded
- to the nearest valid value. Default: 512.
- jit_compile: Whether to compile the underlying models to XLA.
- This can lead to a significant speedup on some systems. Default: False.
+ img_height: int, height of the images to generate, in pixels. Note that
+ only multiples of 128 are supported; the value provided will be
+ rounded to the nearest valid value. Defaults to 512.
+ img_width: int, width of the images to generate, in pixels. Note that
+ only multiples of 128 are supported; the value provided will be
+ rounded to the nearest valid value. Defaults to 512.
+ jit_compile: bool, whether to compile the underlying models to XLA.
+ This can lead to a significant speedup on some systems. Defaults to
+ False.
Example:
@@ -536,7 +545,7 @@ class StableDiffusion(StableDiffusionBase):
References:
- [About Stable Diffusion](https://stability.ai/blog/stable-diffusion-announcement)
- [Original implementation](https://github.com/CompVis/stable-diffusion)
- """
+ """ # noqa: E501
def __init__(
self,
@@ -548,7 +557,7 @@ def __init__(
print(
"By using this model checkpoint, you acknowledge that its usage is "
"subject to the terms of the CreativeML Open RAIL-M license at "
- "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/LICENSE"
+ "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/LICENSE" # noqa: E501
)
@property
@@ -566,7 +575,8 @@ def text_encoder(self):
@property
def diffusion_model(self):
"""diffusion_model returns the diffusion model with pretrained weights.
- Can be overriden for tasks where the diffusion model needs to be modified.
+ Can be overridden for tasks where the diffusion model needs to be
+ modified.
"""
if self._diffusion_model is None:
self._diffusion_model = DiffusionModel(
@@ -581,23 +591,24 @@ class StableDiffusionV2(StableDiffusionBase):
"""Keras implementation of Stable Diffusion v2.
Note that the StableDiffusion API, as well as the APIs of the sub-components
- of StableDiffusionV2 (e.g. ImageEncoder, DiffusionModelV2) should be considered
- unstable at this point. We do not guarantee backwards compatability for
- future changes to these APIs.
+ of StableDiffusionV2 (e.g. ImageEncoder, DiffusionModelV2) should be
+ considered unstable at this point. We do not guarantee backwards
+ compatibility for future changes to these APIs.
Stable Diffusion is a powerful image generation model that can be used,
- among other things, to generate pictures according to a short text description
- (called a "prompt").
+ among other things, to generate pictures according to a short text + description (called a "prompt"). Arguments: - img_height: Height of the images to generate, in pixel. Note that only - multiples of 128 are supported; the value provided will be rounded - to the nearest valid value. Default: 512. - img_width: Width of the images to generate, in pixel. Note that only - multiples of 128 are supported; the value provided will be rounded - to the nearest valid value. Default: 512. - jit_compile: Whether to compile the underlying models to XLA. - This can lead to a significant speedup on some systems. Default: False. + img_height: int, height of the images to generate, in pixel. Note that + only multiples of 128 are supported; the value provided will be + rounded to the nearest valid value. Defaults to 512. + img_width: int, width of the images to generate, in pixel. Note that + only multiples of 128 are supported; the value provided will be + rounded to the nearest valid value. Defaults to 512. + jit_compile: bool, whether to compile the underlying models to XLA. + This can lead to a significant speedup on some systems. Defaults to + False. Example: ```python @@ -619,7 +630,7 @@ class StableDiffusionV2(StableDiffusionBase): - [About Stable Diffusion](https://stability.ai/blog/stable-diffusion-announcement) - [Original implementation](https://github.com/Stability-AI/stablediffusion) - """ + """ # noqa: E501 def __init__( self, @@ -649,7 +660,8 @@ def text_encoder(self): @property def diffusion_model(self): """diffusion_model returns the diffusion model with pretrained weights. - Can be overriden for tasks where the diffusion model needs to be modified. + Can be overriden for tasks where the diffusion model needs to be + modified. """ if self._diffusion_model is None: self._diffusion_model = DiffusionModelV2( diff --git a/keras_cv/models/stable_diffusion/text_encoder.py b/keras_cv/models/stable_diffusion/text_encoder.py index b458ce6043..307e3b35c1 100644 --- a/keras_cv/models/stable_diffusion/text_encoder.py +++ b/keras_cv/models/stable_diffusion/text_encoder.py @@ -35,8 +35,8 @@ def __init__( if download_weights: text_encoder_weights_fpath = keras.utils.get_file( - origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/kcv_encoder.h5", - file_hash="4789e63e07c0e54d6a34a29b45ce81ece27060c499a709d556c7755b42bb0dc4", + origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/kcv_encoder.h5", # noqa: E501 + file_hash="4789e63e07c0e54d6a34a29b45ce81ece27060c499a709d556c7755b42bb0dc4", # noqa: E501 ) self.load_weights(text_encoder_weights_fpath) @@ -59,8 +59,8 @@ def __init__( if download_weights: text_encoder_weights_fpath = keras.utils.get_file( - origin="https://huggingface.co/ianstenbit/keras-sd2.1/resolve/main/text_encoder_v2_1.h5", - file_hash="985002e68704e1c5c3549de332218e99c5b9b745db7171d5f31fcd9a6089f25b", + origin="https://huggingface.co/ianstenbit/keras-sd2.1/resolve/main/text_encoder_v2_1.h5", # noqa: E501 + file_hash="985002e68704e1c5c3549de332218e99c5b9b745db7171d5f31fcd9a6089f25b", # noqa: E501 ) self.load_weights(text_encoder_weights_fpath) diff --git a/keras_cv/models/task.py b/keras_cv/models/task.py index 6a999940f7..8737862557 100644 --- a/keras_cv/models/task.py +++ b/keras_cv/models/task.py @@ -61,12 +61,14 @@ def presets(cls): @classproperty def presets_with_weights(cls): - """Dictionary of preset names and configurations that include weights.""" + """Dictionary of preset names and configurations that include + weights.""" return {} 
@classproperty
def backbone_presets(cls):
- """Dictionary of preset names and configurations for compatible backbones."""
+ """Dictionary of preset names and configurations for compatible
+ backbones."""
return {}
@classmethod
@@ -76,7 +78,8 @@ def from_preset(
load_weights=None,
**kwargs,
):
- """Instantiate {{model_name}} model from preset architecture and weights.
+ """Instantiate {{model_name}} model from preset architecture and
+ weights.
Args:
preset: string. Must be one of "{{preset_names}}".
@@ -150,11 +153,11 @@ def from_preset(
return model
def __init_subclass__(cls, **kwargs):
- # Use __init_subclass__ to setup a correct docstring for from_preset.
+ # Use __init_subclass__ to set up a correct docstring for from_preset.
super().__init_subclass__(**kwargs)
# If the subclass does not define from_preset, assign a wrapper so that
- # each class can have an distinct docstring.
+ # each class can have a distinct docstring.
if "from_preset" not in cls.__dict__:
def from_preset(calling_cls, *args, **kwargs):
diff --git a/keras_cv/models/utils.py b/keras_cv/models/utils.py
index cc728941c7..df11081a95 100644
--- a/keras_cv/models/utils.py
+++ b/keras_cv/models/utils.py
@@ -31,24 +31,26 @@ def parse_model_inputs(input_shape, input_tensor):
def as_backbone(self, min_level=None, max_level=None):
"""Convert the application model into a model backbone for other tasks.
The backbone model will usually take same inputs as the original application
- model, but produce multiple outputs, one for each feature level. Those outputs
- can be feed to network downstream, like FPN and RPN.
- The output of the backbone model will be a dict with int as key and tensor as
- value. The int key represent the level of the feature output.
- A typical feature pyramid has five levels corresponding to scales P3, P4, P5,
- P6, P7 in the backbone. Scale Pn represents a feature map 2n times smaller in
- width and height than the input image.
+ model, but produce multiple outputs, one for each feature level. Those
+ outputs can be fed to networks downstream, like FPN and RPN. The output of
+ the backbone model will be a dict with int as key and tensor as value. The
+ int key represents the level of the feature output. A typical feature pyramid
+ has five levels corresponding to scales P3, P4, P5, P6, P7 in the backbone.
+ Scale Pn represents a feature map 2n times smaller in width and height than
+ the input image.
Args:
- min_level: optional int, the lowest level of feature to be included in the
- output. Default to model's lowest feature level (based on the model structure).
- max_level: optional int, the highest level of feature to be included in the
- output. Default to model's highest feature level (based on the model structure).
+ min_level: optional int, the lowest level of feature to be included in
+ the output, defaults to model's lowest feature level
+ (based on the model structure).
+ max_level: optional int, the highest level of feature to be included in
+ the output, defaults to model's highest feature level
+ (based on the model structure).
Returns:
a `keras.Model` which has dict as outputs.
Raises:
- ValueError: When the model is lack of information for feature level, and can't
- be converted to backbone model, or the min_level/max_level param is out of
- range based on the model structure.
+ ValueError: When the model lacks information for the feature level, and
+ can't be converted to a backbone model, or the min_level/max_level param
+ is out of range based on the model structure.
""" if hasattr(self, "_backbone_level_outputs"): backbone_level_outputs = self._backbone_level_outputs diff --git a/keras_cv/models/vgg16.py b/keras_cv/models/vgg16.py index 20fe6d360d..a83e9f5ad4 100644 --- a/keras_cv/models/vgg16.py +++ b/keras_cv/models/vgg16.py @@ -15,8 +15,9 @@ """VGG16 model for KerasCV. Reference: - - [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) (ICLR 2015) -""" + - [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) + (ICLR 2015) +""" # noqa: E501 import tensorflow as tf from tensorflow import keras @@ -42,9 +43,10 @@ def apply_vgg_block( num_layers: int, number of CNN layers in the block filters: int, filter size of each CNN layer in block kernel_size: int (or) tuple, kernel size for CNN layer in block - activation: str (or) callable, activation function for each CNN layer in block + activation: str (or) callable, activation function for each CNN layer in + block padding: str (or) callable, padding function for each CNN layer in block - max_pool: bool, whether or not to add MaxPooling2D layer at end of block. + max_pool: bool, whether to add MaxPooling2D layer at end of block name: str, name of the block Returns: @@ -67,19 +69,22 @@ def apply_vgg_block( class VGG16(keras.Model): """ Reference: - - [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) (ICLR 2015) + - [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) + (ICLR 2015) This class represents a Keras VGG16 model. Args: - include_rescaling: bool, whether or not to Rescale the inputs.If set to True, - inputs will be passed through a `Rescaling(1/255.0)` layer. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(1/255.0)` layer. include_top: bool, whether to include the 3 fully-connected - layers at the top of the network. If provided, num_classes must be provided. - num_classes: int, optional number of classes to classify images into, only to be - specified if `include_top` is True. - weights: os.PathLike or None, one of `None` (random initialization), or a pretrained weight file path. + layers at the top of the network. If provided, num_classes must be + provided. + num_classes: int, optional number of classes to classify images into, + only to be specified if `include_top` is True. + weights: os.PathLike or None, one of `None` (random initialization), or a + pretrained weight file path. input_shape: tuple, optional shape tuple, defaults to (224, 224, 3). - input_tensor: Tensor, optional Keras tensor (i.e. output of `layers.Input()`) - to use as image input for the model. + input_tensor: Tensor, optional Keras tensor (i.e. output of + `layers.Input()`) to use as image input for the model. pooling: bool, Optional pooling mode for feature extraction when `include_top` is `False`. - `None` means that the output of the model will be @@ -96,10 +101,10 @@ class VGG16(keras.Model): `classifier_activation=None` to return the logits of the "top" layer. When loading pretrained weights, `classifier_activation` can only be `None` or `"softmax"`. - name: (Optional) name to pass to the model. Defaults to "VGG16". + name: (Optional) name to pass to the model, defaults to "VGG16". Returns: A `keras.Model` instance. 
- """ + """ # noqa: E501 def __init__( self, @@ -116,8 +121,9 @@ def __init__( ): if weights and not tf.io.gfile.exists(weights): raise ValueError( - "The `weights` argument should be either `None` or the path to the " - "weights file to be loaded. Weights file not found at location: {weights}" + "The `weights` argument should be either `None` or the path " + "to the weights file to be loaded. Weights file not found at " + "location: {weights}" ) if include_top and not num_classes: diff --git a/keras_cv/models/vgg19.py b/keras_cv/models/vgg19.py index 2fd0b3f7bb..fbef3d27bb 100644 --- a/keras_cv/models/vgg19.py +++ b/keras_cv/models/vgg19.py @@ -15,8 +15,9 @@ """VGG19 model for KerasCV. Reference: - - [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) (ICLR 2015) -""" + - [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) + (ICLR 2015) +""" # noqa: E501 import tensorflow as tf from tensorflow import keras from tensorflow.keras import layers @@ -29,19 +30,22 @@ class VGG19(keras.Model): """ Reference: - - [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) (ICLR 2015) + - [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) + (ICLR 2015) This class represents a Keras VGG19 model. Args: - include_rescaling: bool, whether or not to Rescale the inputs.If set to True, - inputs will be passed through a `Rescaling(1/255.0)` layer. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(1/255.0)` layer. include_top: bool, whether to include the 3 fully-connected - layers at the top of the network. If provided, num_classes must be provided. - num_classes: int, optional number of classes to classify images into, only to be - specified if `include_top` is True. - weights: os.PathLike or None, one of `None` (random initialization), or a pretrained weight file path. + layers at the top of the network. If provided, num_classes must be + provided. + num_classes: int, optional number of classes to classify images into, only + to be specified if `include_top` is True. + weights: os.PathLike or None, one of `None` (random initialization), or a + pretrained weight file path. input_shape: tuple, optional shape tuple, defaults to (224, 224, 3). - input_tensor: Tensor, optional Keras tensor (i.e. output of `layers.Input()`) - to use as image input for the model. + input_tensor: Tensor, optional Keras tensor (i.e. output of + `layers.Input()`) to use as image input for the model. pooling: bool, Optional pooling mode for feature extraction when `include_top` is `False`. - `None` means that the output of the model will be @@ -58,10 +62,10 @@ class VGG19(keras.Model): `classifier_activation=None` to return the logits of the "top" layer. When loading pretrained weights, `classifier_activation` can only be `None` or `"softmax"`. - name: (Optional) name to pass to the model. Defaults to "VGG19". + name: (Optional) name to pass to the model, defaults to "VGG19". Returns: A `keras.Model` instance. - """ + """ # noqa: E501 def __init__( self, @@ -78,8 +82,9 @@ def __init__( ): if weights and not tf.io.gfile.exists(weights): raise ValueError( - "The `weights` argument should be either `None` or the path to the " - "weights file to be loaded. 
Weights file not found at location: {weights}" + "The `weights` argument should be either `None` or the path " + "to the weights file to be loaded. Weights file not found at " + "location: {weights}" ) if include_top and not num_classes: diff --git a/keras_cv/models/vit.py b/keras_cv/models/vit.py index ee79657217..df2a7c2afe 100644 --- a/keras_cv/models/vit.py +++ b/keras_cv/models/vit.py @@ -13,9 +13,11 @@ # limitations under the License. """ViT (Vision Transformer) models for Keras. Reference: - - [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929v2) (ICLR 2021) - - [How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers](https://arxiv.org/abs/2106.10270) (CoRR 2021) -""" + - [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929v2) + (ICLR 2021) + - [How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers](https://arxiv.org/abs/2106.10270) + (CoRR 2021) +""" # noqa: E501 import tensorflow as tf from tensorflow import keras @@ -121,10 +123,12 @@ BASE_DOCSTRING = """Instantiates the {name} architecture. Reference: - - [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929v2) (ICLR 2021) + - [An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale](https://arxiv.org/abs/2010.11929v2) + (ICLR 2021) This function returns a Keras {name} model. - The naming convention of ViT models follows: ViTSize_Patch-size (i.e. ViTS16). + The naming convention of ViT models follows: ViTSize_Patch-size + (i.e. ViTS16). The following sizes were released in the original paper: - S (Small) - B (Base) @@ -133,46 +137,49 @@ - Ti (Tiny) - H (Huge) - The parameter configurations for all of these sizes, at patch sizes 16 and 32 are made available, following the naming convention - laid out above. + The parameter configurations for all of these sizes, at patch sizes 16 and + 32 are made available, following the naming convention laid out above. - For transfer learning use cases, make sure to read the [guide to transfer - learning & fine-tuning](https://keras.io/guides/transfer_learning/). + For transfer learning use cases, make sure to read the + [guide to transfer learning & fine-tuning](https://keras.io/guides/transfer_learning/). Args: - include_rescaling: bool, whether or not to Rescale the inputs. If set to True, - inputs will be passed through a `Rescaling(scale=1./255.0)` layer. Note that ViTs - expect an input range of `[0..1]` if rescaling isn't used. Regardless of whether - you supply `[0..1]` or the input is rescaled to `[0..1]`, the inputs will further be - rescaled to `[-1..1]`. - include_top: bool, whether to include the fully-connected layer at the top of the - network. If provided, num_classes must be provided. - num_classes: optional int, number of classes to classify images into, only to be - specified if `include_top` is True. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(scale=1./255.0)` + layer. Note that ViTs expect an input range of `[0..1]` if rescaling + isn't used. Regardless of whether you supply `[0..1]` or the input + is rescaled to `[0..1]`, the inputs will further be rescaled to + `[-1..1]`. + include_top: bool, whether to include the fully-connected layer at the + top of the network. If provided, num_classes must be provided. 
+ num_classes: optional int, number of classes to classify images into, + only to be specified if `include_top` is True. weights: one of `None` (random initialization), a pretrained weight file - path, or a reference to pre-trained weights (e.g. 'imagenet/classification') - (see available pre-trained weights in weights.py). Note that the 'imagenet' - weights only work on an input shape of (224, 224, 3) due to the input shape dependent + path, or a reference to pre-trained weights + (e.g. 'imagenet/classification') (see available pre-trained weights + in weights.py). Note that the 'imagenet' weights only work on an + input shape of (224, 224, 3) due to the input shape dependent patching and flattening logic. input_shape: optional shape tuple, defaults to (None, None, 3). input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) to use as image input for the model. pooling: optional pooling mode for feature extraction when `include_top` is `False`. - - `None` means that the output of the model will be the 4D tensor output - of the last convolutional block. - - `avg` means that global average pooling will be applied to the output - of the last convolutional block, and thus the output of the model will - be a 2D tensor. + - `None` means that the output of the model will be the 4D tensor + output of the last convolutional block. + - `avg` means that global average pooling will be applied to the + output of the last convolutional block, and thus the output of + the model will be a 2D tensor. - `max` means that global max pooling will be applied. - `token_pooling`, default, means that the token at the start of the sequences is used instead of regular pooling. - name: (Optional) name to pass to the model. Defaults to "{name}". - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. + name: (Optional) name to pass to the model, defaults to "{name}". + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. Returns: A `keras.Model` instance. -""" +""" # noqa: E501 @keras.utils.register_keras_serializable(package="keras_cv.models") @@ -182,8 +189,8 @@ class ViT(keras.Model): Args: mlp_dim: the dimensionality of the hidden Dense layer in the transformer MLP head - include_rescaling: bool, whether or not to Rescale the inputs. If set to True, - inputs will be passed through a `Rescaling(1/255.0)` layer. + include_rescaling: bool, whether to rescale the inputs. If set to + True, inputs will be passed through a `Rescaling(1/255.0)` layer. name: string, model name. include_top: bool, whether to include the fully-connected layer at the top of the network. @@ -208,10 +215,10 @@ class ViT(keras.Model): num_classes: optional number of classes to classify images into, only to be specified if `include_top` is True. 
mlp_dim: - project_dim: the latent dimensionality to be projected into in the output - of each stacked transformer encoder - activation: the activation function to use in the first `layers.Dense` layer - in the MLP head of the transformer encoder + project_dim: the latent dimensionality to be projected into in the + output of each stacked transformer encoder + activation: the activation function to use in the first `layers.Dense` + layer in the MLP head of the transformer encoder attention_dropout: the dropout rate to apply to the `MultiHeadAttention` in each transformer encoder mlp_dropout: the dropout rate to apply between `layers.Dense` layers @@ -222,9 +229,10 @@ class ViT(keras.Model): in the Vision Transformer patch_size: the patch size to be supplied to the Patching layer to turn input images into a flattened sequence of patches - classifier_activation: A `str` or callable. The activation function to use - on the "top" layer. Ignored unless `include_top=True`. Set - `classifier_activation=None` to return the logits of the "top" layer. + classifier_activation: A `str` or callable. The activation function to + use on the "top" layer. Ignored unless `include_top=True`. Set + `classifier_activation=None` to return the logits of the "top" + layer. **kwargs: Pass-through keyword arguments to `keras.Model`. """ @@ -250,8 +258,9 @@ def __init__( ): if weights and not tf.io.gfile.exists(weights): raise ValueError( - "The `weights` argument should be either `None` or the path to the " - "weights file to be loaded. Weights file not found at location: {weights}" + "The `weights` argument should be either `None` or the path " + "to the weights file to be loaded. Weights file not found at " + "location: {weights}" ) if include_top and not num_classes: diff --git a/keras_cv/models/weights.py b/keras_cv/models/weights.py index 08053e1e1b..ea31c46839 100644 --- a/keras_cv/models/weights.py +++ b/keras_cv/models/weights.py @@ -39,7 +39,7 @@ def parse_weights(weights, include_top, model_type): raise ValueError( "The `weights` argument should be either `None`, a the path to the " "weights file to be loaded, or the name of pre-trained weights from " - "https://github.com/keras-team/keras-cv/blob/master/keras_cv/models/weights.py. " + "https://github.com/keras-team/keras-cv/blob/master/keras_cv/models/weights.py. 
" # noqa: E501 f"Invalid `weights` argument: {weights}" ) @@ -130,86 +130,86 @@ def parse_weights(weights, include_top, model_type): WEIGHTS_CONFIG = { "convmixer_512_16": { - "imagenet/classification-v0": "861f3080dc383f7936d3df89691aadea05eee6acaa4a0b60aa70dd657df915ee", - "imagenet/classification-v0-notop": "aa08c7fa9ca6ec045c4783e1248198dbe1bc141e2ae788e712de471c0370822c", + "imagenet/classification-v0": "861f3080dc383f7936d3df89691aadea05eee6acaa4a0b60aa70dd657df915ee", # noqa: E501 + "imagenet/classification-v0-notop": "aa08c7fa9ca6ec045c4783e1248198dbe1bc141e2ae788e712de471c0370822c", # noqa: E501 }, "cspdarknetl": { - "imagenet/classification-v0": "8bdc3359222f0d26f77aa42c4e97d67a05a1431fe6c448ceeab9a9c5a34ff804", - "imagenet/classification-v0-notop": "9303aabfadffbff8447171fce1e941f96d230d8f3cef30d3f05a9c85097f8f1e", + "imagenet/classification-v0": "8bdc3359222f0d26f77aa42c4e97d67a05a1431fe6c448ceeab9a9c5a34ff804", # noqa: E501 + "imagenet/classification-v0-notop": "9303aabfadffbff8447171fce1e941f96d230d8f3cef30d3f05a9c85097f8f1e", # noqa: E501 }, "cspdarknettiny": { - "imagenet/classification-v0": "c17fe6d7b597f2eb25e42fbd97ec58fb1dad753ba18920cc27820953b7947704", - "imagenet/classification-v0-notop": "0007ae82c95be4d4aef06368a7c38e006381324d77e5df029b04890e18a8ad19", + "imagenet/classification-v0": "c17fe6d7b597f2eb25e42fbd97ec58fb1dad753ba18920cc27820953b7947704", # noqa: E501 + "imagenet/classification-v0-notop": "0007ae82c95be4d4aef06368a7c38e006381324d77e5df029b04890e18a8ad19", # noqa: E501 }, "darknet53": { - "imagenet/classification-v0": "7bc5589f7f7f7ee3878e61ab9323a71682bfb617eb57f530ca8757c742f00c77", - "imagenet/classification-v0-notop": "8dcce43163e4b4a63e74330ba1902e520211db72d895b0b090b6bfe103e7a8a5", + "imagenet/classification-v0": "7bc5589f7f7f7ee3878e61ab9323a71682bfb617eb57f530ca8757c742f00c77", # noqa: E501 + "imagenet/classification-v0-notop": "8dcce43163e4b4a63e74330ba1902e520211db72d895b0b090b6bfe103e7a8a5", # noqa: E501 }, "deeplabv3": { - "voc/segmentation-v0": "732042e8b6c9ddba3d51c861f26dc41865187e9f85a0e5d43dfef75a405cca18", + "voc/segmentation-v0": "732042e8b6c9ddba3d51c861f26dc41865187e9f85a0e5d43dfef75a405cca18", # noqa: E501 }, "densenet121": { - "imagenet/classification-v0": "13de3d077ad9d9816b9a0acc78215201d9b6e216c7ed8e71d69cc914f8f0775b", - "imagenet/classification-v0-notop": "709afe0321d9f2b2562e562ff9d0dc44cca10ed09e0e2cfba08d783ff4dab6bf", + "imagenet/classification-v0": "13de3d077ad9d9816b9a0acc78215201d9b6e216c7ed8e71d69cc914f8f0775b", # noqa: E501 + "imagenet/classification-v0-notop": "709afe0321d9f2b2562e562ff9d0dc44cca10ed09e0e2cfba08d783ff4dab6bf", # noqa: E501 }, "densenet169": { - "imagenet/classification-v0": "4cd2a661d0cb2378574073b23129ee4d06ea53c895c62a8863c44ee039e236a1", - "imagenet/classification-v0-notop": "a99d1bb2cbe1a59a1cdd1f435fb265453a97c2a7b723d26f4ebee96e5fb49d62", + "imagenet/classification-v0": "4cd2a661d0cb2378574073b23129ee4d06ea53c895c62a8863c44ee039e236a1", # noqa: E501 + "imagenet/classification-v0-notop": "a99d1bb2cbe1a59a1cdd1f435fb265453a97c2a7b723d26f4ebee96e5fb49d62", # noqa: E501 }, "densenet201": { - "imagenet/classification-v0": "3b6032e744e5e5babf7457abceaaba11fcd449fe2d07016ae5076ac3c3c6cf0c", - "imagenet/classification-v0-notop": "c1189a934f12c1a676a9cf52238e5994401af925e2adfc0365bad8133c052060", + "imagenet/classification-v0": "3b6032e744e5e5babf7457abceaaba11fcd449fe2d07016ae5076ac3c3c6cf0c", # noqa: E501 + "imagenet/classification-v0-notop": 
"c1189a934f12c1a676a9cf52238e5994401af925e2adfc0365bad8133c052060", # noqa: E501 }, "efficientnetv2b0": { - "imagenet/classification-v0": "dbde38e7c56af5bdafe61fd798cf5d490f3c5e3b699da7e25522bc828d208984", - "imagenet/classification-v0-notop": "ac95f13a8ad1cee41184fc16fd0eb769f7c5b3131151c6abf7fcee5cc3d09bc8", + "imagenet/classification-v0": "dbde38e7c56af5bdafe61fd798cf5d490f3c5e3b699da7e25522bc828d208984", # noqa: E501 + "imagenet/classification-v0-notop": "ac95f13a8ad1cee41184fc16fd0eb769f7c5b3131151c6abf7fcee5cc3d09bc8", # noqa: E501 }, "efficientnetv2b1": { - "imagenet/classification-v0": "9dd8f3c8de3bbcc269a1b9aed742bb89d56be445b6aa271aa6037644f4210e9a", - "imagenet/classification-v0-notop": "82da111f8411f47e3f5eef090da76340f38e222f90a08bead53662f2ebafb01c", + "imagenet/classification-v0": "9dd8f3c8de3bbcc269a1b9aed742bb89d56be445b6aa271aa6037644f4210e9a", # noqa: E501 + "imagenet/classification-v0-notop": "82da111f8411f47e3f5eef090da76340f38e222f90a08bead53662f2ebafb01c", # noqa: E501 }, "efficientnetv2b2": { - "imagenet/classification-v0": "05eb5674e0ecbf34d5471f611bcfa5da0bb178332dc4460c7a911d68f9a2fe87", - "imagenet/classification-v0-notop": "02d12c9d1589b540b4e84ffdb54ff30c96099bd59e311a85ddc7180efc65e955", + "imagenet/classification-v0": "05eb5674e0ecbf34d5471f611bcfa5da0bb178332dc4460c7a911d68f9a2fe87", # noqa: E501 + "imagenet/classification-v0-notop": "02d12c9d1589b540b4e84ffdb54ff30c96099bd59e311a85ddc7180efc65e955", # noqa: E501 }, "efficientnetv2s": { - "imagenet/classification-v0": "2259db3483a577b5473dd406d1278439bd1a704ee477ff01a118299b134bd4db", - "imagenet/classification-v0-notop": "80555436ea49100893552614b4dce98de461fa3b6c14f8132673817d28c83654", + "imagenet/classification-v0": "2259db3483a577b5473dd406d1278439bd1a704ee477ff01a118299b134bd4db", # noqa: E501 + "imagenet/classification-v0-notop": "80555436ea49100893552614b4dce98de461fa3b6c14f8132673817d28c83654", # noqa: E501 }, "resnet50": { - "imagenet/classification-v0": "1525dc1ce580239839ba6848c0f1b674dc89cb9ed73c4ed49eba355b35eac3ce", - "imagenet/classification-v0-notop": "dc5f6d8f929c78d0fc192afecc67b11ac2166e9d8b9ef945742368ae254c07af", + "imagenet/classification-v0": "1525dc1ce580239839ba6848c0f1b674dc89cb9ed73c4ed49eba355b35eac3ce", # noqa: E501 + "imagenet/classification-v0-notop": "dc5f6d8f929c78d0fc192afecc67b11ac2166e9d8b9ef945742368ae254c07af", # noqa: E501 }, "resnet50v2": { - "imagenet/classification-v0": "11bde945b54d1dca65101be2648048abca8a96a51a42820d87403486389790db", - "imagenet/classification-v0-notop": "5b4aca4932c433d84f6aef58135472a4312ed2fa565d53fedcd6b0c24b54ab4a", - "imagenet/classification-v1": "a32e5d9998e061527f6f947f36d8e794ad54dad71edcd8921cda7804912f3ee7", - "imagenet/classification-v1-notop": "ac46b82c11070ab2f69673c41fbe5039c9eb686cca4f34cd1d79412fd136f1ae", - "imagenet/classification-v2": "5ee5a8ac650aaa59342bc48ffe770e6797a5550bcc35961e1d06685292c15921", - "imagenet/classification-v2-notop": "e711c83d6db7034871f6d345a476c8184eab99dbf3ffcec0c1d8445684890ad9", + "imagenet/classification-v0": "11bde945b54d1dca65101be2648048abca8a96a51a42820d87403486389790db", # noqa: E501 + "imagenet/classification-v0-notop": "5b4aca4932c433d84f6aef58135472a4312ed2fa565d53fedcd6b0c24b54ab4a", # noqa: E501 + "imagenet/classification-v1": "a32e5d9998e061527f6f947f36d8e794ad54dad71edcd8921cda7804912f3ee7", # noqa: E501 + "imagenet/classification-v1-notop": "ac46b82c11070ab2f69673c41fbe5039c9eb686cca4f34cd1d79412fd136f1ae", # noqa: E501 + "imagenet/classification-v2": 
"5ee5a8ac650aaa59342bc48ffe770e6797a5550bcc35961e1d06685292c15921", # noqa: E501 + "imagenet/classification-v2-notop": "e711c83d6db7034871f6d345a476c8184eab99dbf3ffcec0c1d8445684890ad9", # noqa: E501 }, "vittiny16": { - "imagenet/classification-v0": "c8227fde16ec8c2e7ab886169b11b4f0ca9af2696df6d16767db20acc9f6e0dd", - "imagenet/classification-v0-notop": "aa4d727e3c6bd30b20f49d3fa294fb4bbef97365c7dcb5cee9c527e4e83c8f5b", + "imagenet/classification-v0": "c8227fde16ec8c2e7ab886169b11b4f0ca9af2696df6d16767db20acc9f6e0dd", # noqa: E501 + "imagenet/classification-v0-notop": "aa4d727e3c6bd30b20f49d3fa294fb4bbef97365c7dcb5cee9c527e4e83c8f5b", # noqa: E501 }, "vits16": { - "imagenet/classification-v0": "4a66a1a70a879ff33a3ca6ca30633b9eadafea84b421c92174557eee83e088b5", - "imagenet/classification-v0-notop": "8d0111eda6692096676a5453abfec5d04c79e2de184b04627b295f10b1949745", + "imagenet/classification-v0": "4a66a1a70a879ff33a3ca6ca30633b9eadafea84b421c92174557eee83e088b5", # noqa: E501 + "imagenet/classification-v0-notop": "8d0111eda6692096676a5453abfec5d04c79e2de184b04627b295f10b1949745", # noqa: E501 }, "vitb16": { - "imagenet/classification-v0": "6ab4e08c773e08de42023d963a97e905ccba710e2c05ef60c0971978d4a8c41b", - "imagenet/classification-v0-notop": "4a1bdd32889298471cb4f30882632e5744fd519bf1a1525b1fa312fe4ea775ed", + "imagenet/classification-v0": "6ab4e08c773e08de42023d963a97e905ccba710e2c05ef60c0971978d4a8c41b", # noqa: E501 + "imagenet/classification-v0-notop": "4a1bdd32889298471cb4f30882632e5744fd519bf1a1525b1fa312fe4ea775ed", # noqa: E501 }, "vitl16": { - "imagenet/classification-v0": "5a98000f848f2e813ea896b2528983d8d956f8c4b76ceed0b656219d5b34f7fb", - "imagenet/classification-v0-notop": "40d237c44f14d20337266fce6192c00c2f9b890a463fd7f4cb17e8e35b3f5448", + "imagenet/classification-v0": "5a98000f848f2e813ea896b2528983d8d956f8c4b76ceed0b656219d5b34f7fb", # noqa: E501 + "imagenet/classification-v0-notop": "40d237c44f14d20337266fce6192c00c2f9b890a463fd7f4cb17e8e35b3f5448", # noqa: E501 }, "vits32": { - "imagenet/classification-v0": "f5836e3aff2bab202eaee01d98337a08258159d3b718e0421834e98b3665e10a", - "imagenet/classification-v0-notop": "f3907845eff780a4d29c1c56e0ae053411f02fff6fdce1147c4c3bb2124698cd", + "imagenet/classification-v0": "f5836e3aff2bab202eaee01d98337a08258159d3b718e0421834e98b3665e10a", # noqa: E501 + "imagenet/classification-v0-notop": "f3907845eff780a4d29c1c56e0ae053411f02fff6fdce1147c4c3bb2124698cd", # noqa: E501 }, "vitb32": { - "imagenet/classification-v0": "73025caa78459dc8f9b1de7b58f1d64e24a823f170d17e25fcc8eb6179bea179", - "imagenet/classification-v0-notop": "f07b80c03336d731a2a3a02af5cac1e9fc9aa62659cd29e2e7e5c7474150cc71", + "imagenet/classification-v0": "73025caa78459dc8f9b1de7b58f1d64e24a823f170d17e25fcc8eb6179bea179", # noqa: E501 + "imagenet/classification-v0-notop": "f07b80c03336d731a2a3a02af5cac1e9fc9aa62659cd29e2e7e5c7474150cc71", # noqa: E501 }, } diff --git a/keras_cv/ops/iou_3d.py b/keras_cv/ops/iou_3d.py index d2f6edb22e..e39417d2e2 100644 --- a/keras_cv/ops/iou_3d.py +++ b/keras_cv/ops/iou_3d.py @@ -22,8 +22,8 @@ def iou_3d(y_true, y_pred): """Implements IoU computation for 3D upright rotated bounding boxes. - Note that this is implemented using a custom TensorFlow op. If you don't have - KerasCV installed with custom ops, calling this will fail. + Note that this is implemented using a custom TensorFlow op. If you don't + have KerasCV installed with custom ops, calling this will fail. Boxes should have the format CENTER_XYZ_DXDYDZ_PHI. 
Refer to https://github.com/keras-team/keras-cv/blob/master/keras_cv/bounding_box_3d/formats.py diff --git a/keras_cv/ops/iou_3d_test.py b/keras_cv/ops/iou_3d_test.py index 0a3208ec2b..c87787f773 100644 --- a/keras_cv/ops/iou_3d_test.py +++ b/keras_cv/ops/iou_3d_test.py @@ -33,17 +33,20 @@ def testOpCall(self): # 0: a 2x2x2 box centered at 0,0,0, rotated 0 degrees # 1: a 2x2x2 box centered at 1,1,1, rotated 135 degrees # Ground Truth boxes: - # 0: a 2x2x2 box centered at 1,1,1, rotated 45 degrees (idential to predicted box 1) + # 0: a 2x2x2 box centered at 1,1,1, rotated 45 degrees + # (identical to predicted box 1) # 1: a 2x2x2 box centered at 1,1,1, rotated 0 degrees box_preds = [[0, 0, 0, 2, 2, 2, 0], [1, 1, 1, 2, 2, 2, 3 * math.pi / 4]] box_gt = [[1, 1, 1, 2, 2, 2, math.pi / 4], [1, 1, 1, 2, 2, 2, 0]] - # Predicted box 0 and both ground truth boxes overlap by 1/8th of the box. - # Therefore, IiU is 1/15 - # Predicted box 1 is the same as ground truth box 0, therefore IoU is 1 - # Predicted box 1 shares an origin with ground truth box 1, but is rotated by 135 degrees. - # Their IoU can be reduced to that of two overlapping squares that share a center with - # the same offset of 135 degrees, which reduces to the square root of 0.5. + # Predicted box 0 and both ground truth boxes overlap by 1/8th of the + # box. Therefore, IoU is 1/15. + # Predicted box 1 is the same as ground truth box 0, therefore IoU is 1. + # Predicted box 1 shares an origin with ground truth box 1, but is + # rotated by 135 degrees. + # Their IoU can be reduced to that of two overlapping squares that + # share a center with the same offset of 135 degrees, which reduces to + # the square root of 0.5. expected_ious = [[1 / 15, 1 / 15], [1, 0.5**0.5]] self.assertAllClose(iou_3d(box_preds, box_gt), expected_ious) diff --git a/keras_cv/point_cloud/point_cloud.py b/keras_cv/point_cloud/point_cloud.py index 246ce38a7d..50a72d6b84 100644 --- a/keras_cv/point_cloud/point_cloud.py +++ b/keras_cv/point_cloud/point_cloud.py @@ -51,10 +51,9 @@ def within_box3d_index(points, boxes): return tf.concat(results, axis=0) else: raise ValueError( - "is_within_box3d_v2 are expecting inputs point clouds and bounding boxes to " - "be rank 2D (Point, Feature) or 3D (Frame, Point, Feature) tensors. Got shape: {} and {}".format( - points.shape, boxes.shape - ) + "is_within_box3d_v2 expects input point clouds and bounding " + "boxes to be rank 2D (Point, Feature) or 3D (Frame, Point, Feature)" + " tensors. Got shape: {} and {}".format(points.shape, boxes.shape) ) @@ -69,8 +68,8 @@ def group_points_by_boxes(points, boxes): dy, dz, phi]. Returns: - boolean Ragged Tensor of shape [..., num_boxes, ragged_points] for each box, all - the point indices that belong to the box. + boolean Ragged Tensor of shape [..., num_boxes, ragged_points] for each + box, all the point indices that belong to the box. """ num_boxes = boxes.get_shape().as_list()[-2] or tf.shape(boxes)[-2] @@ -207,7 +206,8 @@ def _center_xyzWHD_to_corner_xyz(boxes): boxes: [..., num_boxes, 7] float32 Tensor for 3d boxes in [x, y, z, dx, dy, dz, phi]. Returns: - corners: [..., num_boxes, 8, 3] float32 Tensor for 3d corners in [x, y, z]. + corners: [..., num_boxes, 8, 3] float32 Tensor for 3d corners in + [x, y, z]. """ # relative corners w.r.t to origin point # this will return all corners in top-down counter clockwise instead of @@ -418,8 +418,8 @@ def coordinate_transform(points, pose): Args: points: Float shape [..., 3]: Points to transform to new coordinates.
pose: Float shape [6]: [translate_x, translate_y, translate_z, yaw, roll, - pitch]. The pose in the frame that 'points' comes from, and the definition - of the rotation and translation angles to apply to points. + pitch]. The pose in the frame that 'points' comes from, and the + definition of the rotation and translation angles to apply to points. Returns: 'points' transformed to the coordinates defined by 'pose'. """ @@ -449,13 +449,13 @@ def spherical_coordinate_transform(points): https://en.wikipedia.org/wiki/Spherical_coordinate_system#Coordinate_system_conversions for definitions of the transformations. Args: - points_xyz: A floating point tensor with shape [..., 3], where the inner 3 + points: A floating point tensor with shape [..., 3], where the inner 3 dimensions correspond to xyz coordinates. Returns: A floating point tensor with the same shape [..., 3], where the inner dimensions correspond to (dist, theta, phi), where phi corresponds to - azimuth/yaw (rotation around z), and theta corresponds to pitch/inclination - (rotation around y). + azimuth/yaw (rotation around z), and theta corresponds to + pitch/inclination (rotation around y). """ dist = tf.sqrt(tf.reduce_sum(tf.square(points), axis=-1)) theta = tf.acos(points[..., 2] / tf.maximum(dist, 1e-7)) @@ -466,10 +466,11 @@ def spherical_coordinate_transform(points): def within_a_frustum(points, center, r_distance, theta_width, phi_width): """Check if 3d points are within a 3d frustum. - https://en.wikipedia.org/wiki/Spherical_coordinate_system for definitions of r, theta, and phi. - https://en.wikipedia.org/wiki/Viewing_frustum for defination of a viewing frustum. Here, we - use a conical shaped frustum (https://mathworld.wolfram.com/ConicalFrustum.html). - Currently only xyz format is supported. + https://en.wikipedia.org/wiki/Spherical_coordinate_system for definitions of + r, theta, and phi. https://en.wikipedia.org/wiki/Viewing_frustum for + definition of a viewing frustum. Here, we use a conical shaped frustum + (https://mathworld.wolfram.com/ConicalFrustum.html). Currently, only xyz + format is supported. Args: points: [num_points, 3] float32 Tensor for 3d points in xyz format. @@ -494,7 +495,8 @@ def within_a_frustum(points, center, r_distance, theta_width, phi_width): theta_half_width = theta_width / 2.0 phi_half_width = phi_width / 2.0 - # Points within theta and phi width and further than r distance are selected. + # Points within theta and phi width and + # further than r distance are selected. in_theta_width = (theta < (center_theta + theta_half_width)) & ( theta > (center_theta - theta_half_width) ) diff --git a/keras_cv/point_cloud/point_cloud_test.py b/keras_cv/point_cloud/point_cloud_test.py index dc0e49bbbb..3caee9b530 100644 --- a/keras_cv/point_cloud/point_cloud_test.py +++ b/keras_cv/point_cloud/point_cloud_test.py @@ -251,9 +251,9 @@ def testCoordinateTransform(self): result = point_cloud.coordinate_transform(replicated_points, pose) - # We expect the point to be translated close to the car, and then rotated - # mostly around the x-axis. - # the result is device dependent, skip or ignore this test locally if it fails. + # We expect the point to be translated close to the car, and then + # rotated mostly around the x-axis. The result is device dependent, skip + # or ignore this test locally if it fails. 
expected = np.tile([[[-8.184512, -0.13086952, -0.04200769]]], [2, 4, 1]) self.assertAllClose(expected, result) diff --git a/keras_cv/training/contrastive/contrastive_trainer.py b/keras_cv/training/contrastive/contrastive_trainer.py index f1c98d99f9..f2204fc202 100644 --- a/keras_cv/training/contrastive/contrastive_trainer.py +++ b/keras_cv/training/contrastive/contrastive_trainer.py @@ -25,12 +25,13 @@ class ContrastiveTrainer(keras.Model): Args: encoder: a `keras.Model` to be pre-trained. In most cases, this encoder should not include a top dense layer. - augmenter: a preprocessing layer to randomly augment input images for contrastive learning, - or a tuple of two separate augmenters for the two sides of the contrastive pipeline. - projector: a projection model for contrastive training, or a tuple of two separate - projectors for the two sides of the contrastive pipeline. This shrinks - the feature map produced by the encoder, and is usually a 1 or - 2-layer dense MLP. + augmenter: a preprocessing layer to randomly augment input images for + contrastive learning, or a tuple of two separate augmenters for the + two sides of the contrastive pipeline. + projector: a projection model for contrastive training, or a tuple of + two separate projectors for the two sides of the contrastive + pipeline. This shrinks the feature map produced by the encoder, and + is usually a 1 or 2-layer dense MLP. probe: An optional Keras layer or model which will be trained against class labels at train-time using the encoder output as input. Note that this should be specified iff training with labeled images. @@ -43,7 +44,10 @@ class labels at train-time using the encoder output as input. Usage: ```python - encoder = keras_cv.models.DenseNet121(include_rescaling=True, include_top=False, pooling="avg") + encoder = keras_cv.models.DenseNet121( + include_rescaling=True, + include_top=False, + pooling="avg") augmenter = keras_cv.layers.preprocessing.RandomFlip() projector = keras.layers.Dense(64) probe = keras_cv.training.ContrastiveTrainer.linear_probe(num_classes=10) @@ -82,17 +86,21 @@ def __init__( if encoder.output.shape.rank != 2: raise ValueError( - f"`encoder` must have a flattened output. Expected rank(encoder.output.shape)=2, got encoder.output.shape={encoder.output.shape}" + f"`encoder` must have a flattened output. Expected " + f"rank(encoder.output.shape)=2, got " + f"encoder.output.shape={encoder.output.shape}" ) if type(augmenter) is tuple and len(augmenter) != 2: raise ValueError( - "`augmenter` must be either a single augmenter or a tuple of exactly 2 augmenters." + "`augmenter` must be either a single augmenter or a tuple of " + "exactly 2 augmenters." ) if type(projector) is tuple and len(projector) != 2: raise ValueError( - "`projector` must be either a single augmenter or a tuple of exactly 2 augmenters." + "`projector` must be either a single projector or a tuple of " + "exactly 2 projectors." ) self.augmenters = ( @@ -139,17 +147,22 @@ def compile( if "loss" in kwargs: raise ValueError( - "`loss` parameter in ContrastiveTrainer.compile is ambiguous. Please specify `encoder_loss` or `probe_loss`." + "`loss` parameter in ContrastiveTrainer.compile is ambiguous. " + "Please specify `encoder_loss` or `probe_loss`." ) if "optimizer" in kwargs: raise ValueError( - "`optimizer` parameter in ContrastiveTrainer.compile is ambiguous. Please specify `encoder_optimizer` or `probe_optimizer`." + "`optimizer` parameter in ContrastiveTrainer.compile is " + "ambiguous. 
Please specify `encoder_optimizer` or " + "`probe_optimizer`." ) if "metrics" in kwargs: raise ValueError( - "`metrics` parameter in ContrastiveTrainer.compile is ambiguous. Please specify `encoder_metrics` or `probe_metrics`." + "`metrics` parameter in ContrastiveTrainer.compile is " + "ambiguous. Please specify `encoder_metrics` or " + "`probe_metrics`." ) if self.probe: @@ -255,7 +268,8 @@ def train_step(self, data): def call(self, inputs): raise NotImplementedError( - "ContrastiveTrainer.call() is not implemented - please call your model directly." + "ContrastiveTrainer.call() is not implemented - " + "please call your model directly." ) @staticmethod diff --git a/keras_cv/utils/fill_utils.py b/keras_cv/utils/fill_utils.py index 46292a8633..63e06fc69d 100644 --- a/keras_cv/utils/fill_utils.py +++ b/keras_cv/utils/fill_utils.py @@ -34,10 +34,10 @@ def corners_to_mask(bounding_boxes, mask_shape): """Converts bounding boxes in corners format to boolean masks Args: - bounding_boxes: tensor of rectangle coordinates with shape (batch_size, 4) in - corners format (x0, y0, x1, y1). - mask_shape: a tuple or list of shape (width, height) indicating the output - width and height of masks. + bounding_boxes: tensor of rectangle coordinates with shape + (batch_size, 4) in corners format (x0, y0, x1, y1). + mask_shape: a tuple or list of shape (width, height) indicating the + output width and height of masks. Returns: boolean masks with shape (batch_size, width, height) where True values @@ -59,12 +59,12 @@ def fill_rectangle(images, centers_x, centers_y, widths, heights, fill_values): """Fill rectangles with fill value into images. Args: - images: Tensor of images to fill rectangles into. - centers_x: Tensor of positions of the rectangle centers on the x-axis. - centers_y: Tensor of positions of the rectangle centers on the y-axis. + images: Tensor of images to fill rectangles into + centers_x: Tensor of positions of the rectangle centers on the x-axis + centers_y: Tensor of positions of the rectangle centers on the y-axis widths: Tensor of widths of the rectangles heights: Tensor of heights of the rectangles - fill_values: Tensor with same shape as images to get rectangle fill from. + fill_values: Tensor with same shape as images to get rectangle fill from Returns: images with filled rectangles. """ diff --git a/keras_cv/utils/preprocessing.py b/keras_cv/utils/preprocessing.py index 7048977fd0..97456b3742 100644 --- a/keras_cv/utils/preprocessing.py +++ b/keras_cv/utils/preprocessing.py @@ -55,14 +55,14 @@ def transform_value_range( ): """transforms values in input tensor from original_range to target_range. This function is intended to be used in preprocessing layers that - rely upon color values. This allows us to assume internally that + rely upon color values. This allows us to assume internally that the input tensor is always in the range [0, 255]. Args: - images: the set of images to transform to the target range range. + images: the set of images to transform to the target range. original_range: the value range to transform from. target_range: the value range to transform to. - dtype: the dtype to compute the conversion with. Defaults to tf.float32. + dtype: the dtype to compute the conversion with, defaults to tf.float32. Returns: a new Tensor with values in the target range. @@ -117,10 +117,10 @@ def _unwrap_value_range(value_range, dtype=tf.float32): def blend(image1: tf.Tensor, image2: tf.Tensor, factor: float) -> tf.Tensor: """Blend image1 and image2 using 'factor'. 
- FactorSampler should be in the range [0, 1]. A value of 0.0 means only image1 - is used. A value of 1.0 means only image2 is used. A value between 0.0 - and 1.0 means we linearly interpolate the pixel values between the two - images. A value greater than 1.0 "extrapolates" the difference + FactorSampler should be in the range [0, 1]. A value of 0.0 means only + image1 is used. A value of 1.0 means only image2 is used. A value between + 0.0 and 1.0 means we linearly interpolate the pixel values between the two + images. A value greater than 1.0 "extrapolates" the difference between the two pixel values, and we clip the results to values between 0 and 255. Args: @@ -153,15 +153,15 @@ def parse_factor( if param[0] > param[1]: raise ValueError( - f"`{param_name}[0] > {param_name}[1]`, `{param_name}[0]` must be <= " - f"`{param_name}[1]`. Got `{param_name}={param}`" + f"`{param_name}[0] > {param_name}[1]`, `{param_name}[0]` must be " + f"<= `{param_name}[1]`. Got `{param_name}={param}`" ) if (min_value is not None and param[0] < min_value) or ( max_value is not None and param[1] > max_value ): raise ValueError( - f"`{param_name}` should be inside of range [{min_value}, {max_value}]. " - f"Got {param_name}={param}" + f"`{param_name}` should be inside of range " + f"[{min_value}, {max_value}]. Got {param_name}={param}" ) if param[0] == param[1]: @@ -176,11 +176,12 @@ def random_inversion(random_generator): This can be used by KPLs to randomly invert sampled values. Args: - random_generator: a Keras random number generator. An instance can be passed - from the `self._random_generator` attribute of a `BaseImageAugmentationLayer`. + random_generator: a Keras random number generator. An instance can be + passed from the `self._random_generator` attribute of + a `BaseImageAugmentationLayer`. Returns: - either -1, or -1. + either -1 or 1. negate = random_generator.random_uniform((), 0, 1, dtype=tf.float32) > 0.5 negate = tf.cond(negate, lambda: -1.0, lambda: 1.0) @@ -199,19 +200,19 @@ def batch_random_inversion(random_generator, batch_size): def get_rotation_matrix(angles, image_height, image_width, name=None): """Returns projective transform(s) for the given angle(s). Args: - angles: A scalar angle to rotate all images by, or (for batches of images) a - vector with an angle to rotate each image in the batch. The rank must be - statically known (the shape is not `TensorShape(None)`). + angles: A scalar angle to rotate all images by, or (for batches of images) + a vector with an angle to rotate each image in the batch. The rank + must be statically known (the shape is not `TensorShape(None)`). image_height: Height of the image(s) to be transformed. image_width: Width of the image(s) to be transformed. name: The name of the op. Returns: - A tensor of shape (num_images, 8). Projective transforms which can be given - to operation `image_projective_transform_v2`. If one row of transforms is - [a0, a1, a2, b0, b1, b2, c0, c1], then it maps the *output* point - `(x, y)` to a transformed *input* point - `(x', y') = ((a0 x + a1 y + a2) / k, (b0 x + b1 y + b2) / k)`, - where `k = c0 x + c1 y + 1`. + A tensor of shape (num_images, 8). Projective transforms which can be + given to operation `image_projective_transform_v2`. If one row of + transforms is [a0, a1, a2, b0, b1, b2, c0, c1], then it maps the + *output* point `(x, y)` to a transformed *input* point + `(x', y') = ((a0 x + a1 y + a2) / k, (b0 x + b1 y + b2) / k)`, + where `k = c0 x + c1 y + 1`.
""" with backend.name_scope(name or "rotation_matrix"): x_offset = ( @@ -250,8 +251,8 @@ def get_translation_matrix(translations, name=None): to translate for each image (for a batch of images). name: The name of the op. Returns: - A tensor of shape `(num_images, 8)` projective transforms which can be given - to `transform`. + A tensor of shape `(num_images, 8)` projective transforms which can be + given to `transform`. """ with backend.name_scope(name or "translation_matrix"): num_translations = tf.shape(translations)[0] @@ -288,19 +289,20 @@ def transform( Args: images: A tensor of shape - `(num_images, num_rows, num_columns, num_channels)` (NHWC). The rank must - be statically known (the shape is not `TensorShape(None)`). + `(num_images, num_rows, num_columns, num_channels)` (NHWC). The rank + must be statically known (the shape is not `TensorShape(None)`). transforms: Projective transform matrix/matrices. A vector of length 8 or - tensor of size N x 8. If one row of transforms is [a0, a1, a2, b0, b1, b2, - c0, c1], then it maps the *output* point `(x, y)` to a transformed *input* - point `(x', y') = ((a0 x + a1 y + a2) / k, (b0 x + b1 y + b2) / k)`, where + tensor of size N x 8. If one row of transforms is + [a0, a1, a2, b0, b1, b2, c0, c1], then it maps the *output* point + `(x, y)` to a transformed *input* point + `(x', y') = ((a0 x + a1 y + a2) / k, (b0 x + b1 y + b2) / k)`, where `k = c0 x + c1 y + 1`. The transforms are *inverted* compared to the transform mapping input points to output points. Note that gradients are not backpropagated into transformation parameters. fill_mode: Points outside the boundaries of the input are filled according to the given mode (one of `{"constant", "reflect", "wrap", "nearest"}`). - fill_value: a float represents the value to be filled outside the boundaries - when `fill_mode="constant"`. + fill_value: a float represents the value to be filled outside the + boundaries when `fill_mode="constant"`. interpolation: Interpolation mode. Supported values: `"nearest"`, `"bilinear"`. output_shape: Output dimension after the transform, `[height, width]`. diff --git a/keras_cv/utils/resource_loader.py b/keras_cv/utils/resource_loader.py index c0d6f2bda2..808b3b673b 100644 --- a/keras_cv/utils/resource_loader.py +++ b/keras_cv/utils/resource_loader.py @@ -66,14 +66,15 @@ def display_warning_if_incompatible(self): user_version = tf.__version__ warnings.warn( - f"You are currently using TensorFlow {user_version} and trying to load a KerasCV custom op." - "\n" - f"KerasCV has compiled its custom ops against TensorFlow {TF_VERSION_FOR_ABI_COMPATIBILITY}, " - "and there are no compatibility guarantees between the two versions. " - "\n" - "This means that you might get segfaults when loading the custom op, " - "or other kind of low-level errors.\n If you do, do not file an issue " - "on Github. This is a known limitation.", + f"You are currently using TensorFlow {user_version} and " + f"trying to load a KerasCV custom op.\n" + f"KerasCV has compiled its custom ops against TensorFlow " + f"{TF_VERSION_FOR_ABI_COMPATIBILITY}, and there are no " + f"compatibility guarantees between the two versions.\n" + "This means that you might get segfaults when loading the custom " + "op, or other kind of low-level errors.\n" + "If you do, do not file an issue on Github. 
" + "This is a known limitation.", UserWarning, ) abi_warning_already_raised = True diff --git a/keras_cv/utils/target_gather.py b/keras_cv/utils/target_gather.py index 3492bc64fb..0f51c2595c 100644 --- a/keras_cv/utils/target_gather.py +++ b/keras_cv/utils/target_gather.py @@ -29,18 +29,19 @@ def _target_gather( Args: targets: [N, ...] or [batch_size, N, ...] Tensor representing targets such - as boxes, keypoints, etc. + as boxes, keypoints, etc. indices: [M] or [batch_size, M] int32 Tensor representing indices within - `targets` to gather. - mask: optional [M, ...] or [batch_size, M, ...] boolean - Tensor representing the masking for each target. `True` means the corresponding - entity should be masked to `mask_val`, `False` means the corresponding entity - should be the target value. - mask_val: optional float representing the masking value if `mask` is True on - the entity. + `targets` to gather. + mask: optional [M, ...] or [batch_size, M, ...] boolean Tensor representing + the masking for each target. `True` means the corresponding entity + should be masked to `mask_val`, `False` means the corresponding + entity should be the target value. + mask_val: optional float representing the masking value if `mask` is True + on the entity. - Returns: - targets: [M, ...] or [batch_size, M, ...] Tensor representing selected targets. + Returns: + targets: [M, ...] or [batch_size, M, ...] Tensor representing + selected targets. Raise: ValueError: If `targets` is higher than rank 3. diff --git a/keras_cv/utils/test_utils.py b/keras_cv/utils/test_utils.py index 24787007d6..1ee9a1652c 100644 --- a/keras_cv/utils/test_utils.py +++ b/keras_cv/utils/test_utils.py @@ -20,11 +20,12 @@ def exhaustive_compare(obj1, obj2): - """Exhaustively compared config of any two python or Keras objects recursively. + """Exhaustively compared config of any two python + or Keras objects recursively. - If objects are python objects, a standard equality check is run. If the objects are - Keras objects a `get_config()` call is made. The subsequent configs are then - compared to determine if equality holds. + If objects are python objects, a standard equality check is run. + If the objects are Keras objects a `get_config()` call is made. + The subsequent configs are then compared to determine if equality holds. Args: obj1: any object, can be a Keras object or python object. diff --git a/keras_cv/utils/train.py b/keras_cv/utils/train.py index 14ff11be98..3caf58e356 100644 --- a/keras_cv/utils/train.py +++ b/keras_cv/utils/train.py @@ -35,8 +35,9 @@ def convert_inputs_to_tf_dataset( if isinstance(x, tf.data.Dataset): if y is not None or batch_size is not None: raise ValueError( - "When `x` is a `tf.data.Dataset`, please do not provide a value for " - f"`y` or `batch_size`. Got `y={y}`, `batch_size={batch_size}`." + "When `x` is a `tf.data.Dataset`, please do not " + "provide a value for `y` or `batch_size`. " + "Got `y={y}`, `batch_size={batch_size}`." ) return x diff --git a/keras_cv/version_check.py b/keras_cv/version_check.py index 4606cfc4d5..d419594ff8 100644 --- a/keras_cv/version_check.py +++ b/keras_cv/version_check.py @@ -24,8 +24,9 @@ def check_tf_version(): if parse(tf.__version__) < parse(MIN_VERSION): raise RuntimeError( - f"The Tensorflow package version needs to be at least {MIN_VERSION} " - "for KerasCV to run. Currently, your TensorFlow version is " - f"{tf.__version__}. Please upgrade with `$ pip install --upgrade tensorflow`. " - "You can use `pip freeze` to check afterwards that everything is ok." 
+ "The Tensorflow package version needs to be at least " + f"{MIN_VERSION} for KerasCV to run. Currently, your TensorFlow " + f"version is {tf.__version__}. Please upgrade with `$ pip install " + "--upgrade tensorflow`. You can use `pip freeze` to check " + "afterwards that everything is ok." ) diff --git a/keras_cv/visualization/draw_bounding_boxes.py b/keras_cv/visualization/draw_bounding_boxes.py index 8d05fb3a65..3a1b141acb 100644 --- a/keras_cv/visualization/draw_bounding_boxes.py +++ b/keras_cv/visualization/draw_bounding_boxes.py @@ -36,15 +36,15 @@ def draw_bounding_boxes( ): """Internal utility to draw bounding boxes on the target image. - Accepts a batch of images and batch of bounding boxes. The function draws + Accepts a batch of images and batch of bounding boxes. The function draws the bounding boxes onto the image, and returns a new image tensor with the - annotated images. This API is intentionally not exported, and is considered + annotated images. This API is intentionally not exported, and is considered an implementation detail. Args: images: a batch Tensor of images to plot bounding boxes onto. - bounding_boxes: a Tensor of batched bounding boxes to plot onto the provided - images + bounding_boxes: a Tensor of batched bounding boxes to plot onto the + provided images. color: the color in which to plot the bounding boxes bounding_box_format: The format of bounding boxes to plot onto the images. Refer @@ -52,13 +52,14 @@ def draw_bounding_boxes( for more details on supported bounding box formats. line_thickness: (Optional) line_thickness for the box and text labels. Defaults to 2. - text_thickness: (Optional) the lthickness for the text, defaults to `1.0`. - font_scale: (Optional) scale of font to draw in. Defaults to `1.0`. + text_thickness: (Optional) the thickness for the text, defaults to + `1.0`. + font_scale: (Optional) scale of font to draw in, defaults to `1.0`. class_mapping: (Optional) dictionary from class ID to class label. Returns: the input `images` with provided bounding boxes plotted on top of them - """ + """ # noqa: E501 assert_cv2_installed("draw_bounding_boxes") bounding_boxes = bounding_box.convert_format( bounding_boxes, source=bounding_box_format, target="xyxy", images=images @@ -92,7 +93,7 @@ def draw_bounding_boxes( if class_id == -1: continue - # force conversion back to contigous array + # force conversion back to contiguous array x, y, x2, y2 = int(x), int(y), int(x2), int(y2) cv2.rectangle( image, diff --git a/keras_cv/visualization/plot_bounding_box_gallery.py b/keras_cv/visualization/plot_bounding_box_gallery.py index f8b68e94ac..a17fe923e1 100644 --- a/keras_cv/visualization/plot_bounding_box_gallery.py +++ b/keras_cv/visualization/plot_bounding_box_gallery.py @@ -84,29 +84,32 @@ def unpackage_tfds_inputs(inputs): ![Example bounding box gallery](https://i.imgur.com/tJpb8hZ.png) Args: - images: a Tensor or NumPy array containing images to show in the gallery. - value_range: value range of the images. Common examples include `(0, 255)` - and `(0, 1)`. - bounding_box_format: the bounding_box_format the provided bounding boxes are - in. - y_true: (Optional) a KerasCV bounding box dictionary representing the ground truth - bounding boxes. - y_pred: (Optional) a KerasCV bounding box dictionary representing the predicted - bounding boxes. - pred_color: three element tuple representing the color to use for plotting + images: a Tensor or NumPy array containing images to show in the + gallery. + value_range: value range of the images. 
Common examples include + `(0, 255)` and `(0, 1)`. + bounding_box_format: the bounding_box_format the provided bounding boxes + are in. + y_true: (Optional) a KerasCV bounding box dictionary representing the + ground truth bounding boxes. + y_pred: (Optional) a KerasCV bounding box dictionary representing the predicted bounding boxes. - true_color: three element tuple representing the color to use for plotting - true bounding boxes. + pred_color: three element tuple representing the color to use for + plotting predicted bounding boxes. + true_color: three element tuple representing the color to use for + plotting true bounding boxes. class_mapping: (Optional) class mapping from class IDs to strings - ground_truth_mapping: (Optional) class mapping from class IDs to strings, + ground_truth_mapping: (Optional) class mapping from class IDs to + strings, defaults to `class_mapping` + prediction_mapping: (Optional) class mapping from class IDs to strings, defaults to `class_mapping` - prediction_mapping: (Optional) class mapping from class IDs to strings, - defaults to `class_mapping` - line_thickness: (Optional) line_thickness for the box and text labels. Defaults to 2. - text_thickness: (Optional) the line_thickness for the text, defaults to `1.0`. + line_thickness: (Optional) line_thickness for the box and text labels. + Defaults to 2. + text_thickness: (Optional) the line_thickness for the text, defaults to + `1.0`. font_scale: (Optional) font size to draw bounding boxes in. - legend: Whether or not to create a legend with the specified colors for `y_true` - and `y_pred`. Defaults to False. + legend: whether to create a legend with the specified colors for + `y_true` and `y_pred`, defaults to False. kwargs: keyword arguments to propagate to `keras_cv.visualization.plot_image_gallery()`. """ diff --git a/keras_cv/visualization/plot_image_gallery.py b/keras_cv/visualization/plot_image_gallery.py index 65d4652306..bb9bb062be 100644 --- a/keras_cv/visualization/plot_image_gallery.py +++ b/keras_cv/visualization/plot_image_gallery.py @@ -64,20 +64,22 @@ def unpackage_tfds_inputs(inputs): ![example gallery](https://i.imgur.com/r0ndse0.png) Args: - images: a Tensor or NumPy array containing images to show in the gallery. - value_range: value range of the images. Common examples include `(0, 255)` - and `(0, 1)`. + images: a Tensor or NumPy array containing images to show in the + gallery. + value_range: value range of the images. Common examples include + `(0, 255)` and `(0, 1)`. rows: number of rows in the gallery to show. cols: number of columns in the gallery to show. scale: how large to scale the images in the gallery path: (Optional) path to save the resulting gallery to. - show: (Optional) whether or not to show the gallery of images. - transparent: (Optional) whether or not to give the image a transparent - background. Defaults to `True`. - dpi: (Optional) the dpi to pass to matplotlib.savefig(). Defaults to `60`. + show: (Optional) whether to show the gallery of images. + transparent: (Optional) whether to give the image a transparent + background, defaults to `True`. + dpi: (Optional) the dpi to pass to matplotlib.savefig(), defaults to + `60`. legend_handles: (Optional) matplotlib.patches List of legend handles. - I.e. passing: `[patches.Patch(color='red', label='mylabel')]` will produce - a legend with a single red patch and the label 'mylabel'. + I.e. passing: `[patches.Patch(color='red', label='mylabel')]` will + produce a legend with a single red patch and the label 'mylabel'. 
""" assert_matplotlib_installed("plot_bounding_box_gallery") diff --git a/setup.cfg b/setup.cfg index 3919775381..7b52664a31 100644 --- a/setup.cfg +++ b/setup.cfg @@ -11,9 +11,9 @@ filterwarnings = ignore::RuntimeWarning ignore::PendingDeprecationWarning ignore::FutureWarning + [flake8] -# Allow --max-line-length=200 to support long links in docstrings -max-line-length = 200 +max-line-length = 80 per-file-ignores = ./keras_cv/__init__.py:E402, F401 ./examples/**/*:E402 diff --git a/shell/weights/remove_top.py b/shell/weights/remove_top.py index bd73fdd57b..da13e5755e 100644 --- a/shell/weights/remove_top.py +++ b/shell/weights/remove_top.py @@ -18,7 +18,8 @@ raise ValueError("Weights path must end in .h5") model = eval( - f"keras_cv.models.{FLAGS.model_name}(include_rescaling=True, include_top=True, num_classes=1000, weights=FLAGS.weights_path)" + f"keras_cv.models.{FLAGS.model_name}(include_rescaling=True, " + f"include_top=True, num_classes=1000, weights=FLAGS.weights_path)" ) without_top = keras.models.Model(model.input, model.layers[-3].output) diff --git a/shell/weights/update_training_history.py b/shell/weights/update_training_history.py index 66006e2691..e075a89803 100644 --- a/shell/weights/update_training_history.py +++ b/shell/weights/update_training_history.py @@ -15,12 +15,14 @@ flags.DEFINE_string( "script_version", None, - "commit hash of the latest commit in KerasCV/master for the training script", + "commit hash of the latest commit in KerasCV/master " + "for the training script", ) flags.DEFINE_string( "weights_version", None, - "The version of the training script used to produce the latest weights. For example, v0", + "The version of the training script used to produce the latest weights. " + "For example, v0", ) flags.DEFINE_string( "contributor", @@ -46,19 +48,21 @@ ) full_training_script_path = os.path.abspath(training_script_path) -# Build an experiment name structured as task/training_script_name/model_name-version +# Build an experiment name. +# This will be structured as task/training_script_name/model_name-version training_script_rooted_at_training = full_training_script_path[ full_training_script_path.index("keras-cv/examples/training/") + 27 : ] training_script_dirs = training_script_rooted_at_training.split("/") -tensorboard_experiment_name = f"{training_script_dirs[0]}/{'/'.join(training_script_dirs[1:])[:-3]}/{model_name}-{weights_version}" +tensorboard_experiment_name = f"{training_script_dirs[0]}/{'/'.join(training_script_dirs[1:])[:-3]}/{model_name}-{weights_version}" # noqa: E501 training_script_json_path = full_training_script_path[ : full_training_script_path.index("keras-cv/examples/training/") + 27 ] + "/".join(training_script_dirs[:2] + ["training_history.json"]) script_version = FLAGS.script_version or input( - "Input the commit hash of the latest commit in KerasCV/master for the training script used for training." + "Input the commit hash of the latest commit in KerasCV/master " + "for the training script used for training." 
) tensorboard_logs_path = FLAGS.tensorboard_logs_path or input( @@ -66,7 +70,10 @@ ) tensorboard_experiment_id = ( os.popen( - f"python3 -m tensorboard.main dev upload --logdir {tensorboard_logs_path} --name {tensorboard_experiment_name} --one_shot --verbose 0" + f"python3 -m tensorboard.main dev upload " + f"--logdir {tensorboard_logs_path} " + f"--name {tensorboard_experiment_name} " + f"--one_shot --verbose 0" ) .read() .split("/")[-2] @@ -118,7 +125,8 @@ max_mean_iou = f"{max_mean_iou:.4f}" contributor = FLAGS.contributor or input( - "Input your GitHub username (or the username of the contributor, if it's not you)\n" + "Input your GitHub username " + "(or the username of the contributor, if it's not you)\n" ) accelerators = FLAGS.accelerators or input( @@ -144,7 +152,7 @@ "version": script_version, }, "epochs_trained": training_epochs, - "tensorboard_logs": f"https://tensorboard.dev/experiment/{tensorboard_experiment_id}/", + "tensorboard_logs": f"https://tensorboard.dev/experiment/{tensorboard_experiment_id}/", # noqa: E501 "contributor": contributor, "args": args_dict, "accelerators": int(accelerators),